A problem of optimal information acquisition for its use in general decision making problems
is considered. This motivates the need for developing quantitative measures of information
sources' capabilities for supplying accurate information depending on the particular
content of the latter. A companion article developed the notion of a question difficulty functional
for questions concerning input data for a decision making problem. Here, answers
which an information source may provide in response to such questions are considered. In
particular, a real-valued answer depth functional measuring the degree of accuracy of such
answers is introduced and its overall form is derived under the assumption of isotropic knowledge
structure of the information source. Additionally, information source models that relate
answer depth to question difficulty are discussed. It turns out to be possible to introduce a
notion of an information source capacity as the highest value of the answer depth the source
is capable of providing.
I. INTRODUCTION

The classical Information Theory was developed as a theory of communication, its main prac- 
tical objective being optimization of communication over imperfect channels. Correspondingly, it 
deals with information quantity, while paying little or no attention to either its accuracy or rele- 
vance. The latter omission is by no means a defect of Information Theory but rather its conscious 
choice: the complete abstraction from any content of transmitted information ensured both the 
theory universality and its notable elegance. On the other hand, besides being transmitted, in- 
formation also gets acquired and used in everyday practice of a variety of fields, including science 
and engineering. This typical path of information, from acquisition, via possible transmission, to 
its usage (to make decisions, generate new knowledge etc.) can be schematically depicted as the 
full information chain shown in Fig. 1. One can see that, unlike the middle link of this chain,
the two "end" links do not at this moment enjoy the convenience of being described by any kind 
of universal theory. While several methods for making decisions under incomplete information 
have been developed in considerable detail and used in numerous applications, the overall theory 
providing a unified and explicit treatment of informational aspect of such activity is still largely 
lacking. In fact, the state-of-the-art of (broadly defined) decision making under uncertainty could 
be compared to that of theory and practice of information transmission before the advent of
Information Theory in the late 1940s. Various coding schemes (like Morse code, for instance) existed and
were widely used, but they were developed in largely "trial and error" fashion, and, for example, 
their optimality and theoretical limits were generally unknown. 

One of the goals of the present article is to initiate the development of an information theory of 
the "end" links of the information chain shown in Fig. 1. This would make it necessary, in particular,
to explicate the quantitative properties of information accuracy and relevance, in addition to its 
quantity. This article, which is a follow-up to [2], is mostly devoted to developing a theory of the first
link of the full information chain: the information acquisition link. While a consistent theory of 
the two "end" links of the full information chain appears to require a joint treatment of both links, 
this article, together with [2], develops the basic "machinery" of the information acquisition link 
analysis. It is nevertheless necessary to emphasize that the proposed theory can be logically
complete (even at the most basic level of detail) only when the third (information usage) link has
been considered. This will be the subject of (near) future publications. 

A bit more specifically, in order to give a quantitative description of information acquisition 
in a general setting, the process of information acquisition needs to be formalized. We do it in a 
reasonably obvious way: by introducing notions of questions that an agent (who is assumed to be
solving a certain problem) can ask an information source and answers that the information source 
can provide in response. The central concept introduced in [2] is that of question difficulty, which
is a real-valued functional defined on the set of questions. The meaning of question difficulty is
that a given source can provide more accurate answers to questions with lower difficulty. This
article develops a symmetric concept of answer depth which is a functional on the set of answers 
that can be informally thought of as a measure of the amount of "work" a source has to do to 
provide the answer. If the depth is close to the corresponding question's difficulty, the answer is 
very accurate and vice versa. Thus, for a given question, more accurate answers have larger values 
of depth. On the other hand, sources can be characterized by a capacity that describes the
highest answer depth a source can provide in response to any question. 

Both question difficulty and answer depth are source-specific objects that describe the source 
knowledge structure. Assuming that the description is faithful, a source should be expected to give 
answers of equal depth to questions of equal difficulty, regardless of the details of these questions [4].
This implies that, for a given source, the answer depth should be a function of question difficulty. 
We refer to such a function as a source model. The specific shape of the source model would in 
general have to be found experimentally. However, some features of realistic source models can be 
predicted from principles of "consistency with general experience". We discuss such considerations
in Section VIII.



A. Related work 


This article, together with [2], is an integral part of an effort to extend Information Theory
from the realm of communications into that of information acquisition and usage for solving prob- 
lems. So it can be looked upon as an extension of classical Information Theory which, as was 
mentioned earlier, is focused primarily on information quantity, with Shannon entropy being the 
central concept involved in proper quantification of the latter. Besides fundamental advances in 
communications, the list of successful applications of this and derivative concepts includes (but is by
no means limited to) new algorithms in computer vision [5], new methods of analysis in climatology
[6, 7], physiology [8] and neurophysiology [21]. The concept of pseudoenergy introduced in
[2] and used in the present article extends that of entropy in the direction of information content
and provides the foundation for the quantitative description of knowledge possessed by various 
information sources. 

The idea of using additional information to improve the decision quality has been studied in
the area of statistical decision making. One can mention applications to innovation adoption
[10, 11], fashion decisions [12] and vaccine composition decisions for flu immunization [13]. Typically,
the amount of information in these applications is measured simply by the number of relevant
observations of certain random variable realizations. Some authors [15] introduced models (for
instance, the effective information model) for accounting for the actual amount of information
contained in the received observations.

The problem of optimal usage of information obtained from experts has been addressed mostly
in the form of updating the decision maker's beliefs given probability assessments from multiple
experts [16-19] and, in particular, optimal combining of expert opinions, including experts with
incoherent and missing outputs [20]. In particular, investigations on combining information of
experts that partition the event differently [21] and on rules of updating probabilities based on
outcomes of partially similar events [3] are close in spirit to the approach developed here in
that they deal with different types of information. The emphasis of the proposed approach is on
optimizing on the particular type of information and on the explicit consideration of the dependence
of the optimal information on both the expert and the decision making problem.

This article uses an axiomatic approach to determine the overall form of the answer depth 
functional. The latter, together with the related concept of question difficulty studied in [2], can
be thought of as a logical development of the entropy concept of information theory. The axiomatic 
approach was used in [23] to derive the most general form of the (Shannon) entropy function. A
different set of axioms was used in [24] to find the one-parameter family of functions (known
as Rényi entropies) that included standard entropy as a special case. The concept of structural
entropy was introduced in [25] and used for classification purposes. The Havrda-Charvat entropy
was derived by axiomatic means in [26] where axiomatization of partition entropy was discussed
on rather general grounds (see also [23]).

Information Physics (see [3] for a recent fairly comprehensive review) is a relatively recently 
developed branch of physical sciences focused on the role of information in fundamental laws of 
physics. It is fair to say that Information Physics dates back to the original work of Jaynes
[29, 30] on classical and quantum thermodynamics. There it was shown that the main laws of the
latter could be derived from maximization of Shannon entropy subject to appropriate constraints
expressing the macrostate parameters. These results were later extended to derivations of the main
laws of classical [31] and quantum [32] mechanics. Recently, progress has also been made in obtaining
the main equations of relativistic quantum theory [33]. The central hypothesis of Information Physics
is that the fundamental physics laws are indeed the laws of inductive inference applied to the 
description of respective systems. The main emphasis in discovering fundamental laws thereby 
shifts to that of determining the correct degrees of freedom and the relevant information necessary 
for the description of the system state. From the point of view of main information attributes, 
it can be said that, while the classical Information Theory's main concern is with information 
quantity, Information Physics' focus is on information relevance. 

A somewhat different direction within the field of Information Physics exploits the consequences
of the theory of partially ordered sets (posets). This direction goes back to the work of Cox [34-36] on
foundations of probability considered as a way to consistently describe incomplete information a
conscious agent may possess. More recently, it has been shown in [37-39] that while probability is
a natural (bi-)valuation on the lattice of logical assertions about system states, Shannon entropy
(which is the main tool of information quantification in classical Information Theory) is a natural 
(bi-)valuation on the corresponding lattice of questions. It has been argued that order may be one 
of the most fundamental concepts of science and it has been demonstrated [40] that, for instance,
Lorentz transformations and Minkowski metric of special relativity can be derived directly from 
order-theoretic considerations applied to events in space-time. 



B. Outline 

The rest of the article is organized as follows. In Section II we briefly discuss the necessary
preliminaries. In Section III the main primitives of the proposed framework - questions and answers
- are defined. In Section IV the overall form of the answer depth functional is derived from a set
of plausible postulates that express, in particular, the isotropy property of the source's knowledge
structure. Section V describes relationships between question difficulty and answer depth for main
types of possible questions. In Section VI a special class of answers - the quasi-perfect answers -
is discussed. Section VII is devoted to relationships between different questions and, in particular,
the relative depth of an answer to one question with respect to another question is introduced.
Section VIII introduces the notion of a source model and proposes several simple models
characterized by a well-defined source capacity. Section IX discusses optimization-based methods for
estimating both the source knowledge structure and source model parameters. Section X gives
simple numerical examples illustrating concepts and results discussed earlier in the article. Finally,
Section XI gives a short summary and a discussion of main results.

II. PRELIMINARIES 

The necessary preliminary facts and definitions were already discussed in the companion paper
[2]. We briefly recap them here so that the present article can be read independently. If uncertainty
is present in a decision making problem, it can be described as a certain base space $\Omega$ (equipped
with a suitable sigma-algebra $\mathcal{F}$) that contains all possible sets of input data for the problem. The
problem itself can be formulated as an optimization with respect to a suitably chosen criterion. 

When uncertainty is present, a notion of loss can usually be defined. It measures the performance 
of a solution obtained in the presence of uncertainty with respect to that of a solution that would 
have been obtained had the decision maker possessed the full information. The overall goal of the 
agent can be formulated as that of minimizing the expected loss. To achieve that goal, the agent 
can turn to an information source for additional information. 

We call a collection of (distinct) subsets $\mathbf{C} = \{C_1, \dots, C_r\}$ inclusion-free if for any $C_i, C_j \in \mathbf{C}$
neither of the two is a proper subset of the other. A collection of subsets $\mathbf{C} = \{C_1, \dots, C_r\}$ is
called complete if $\bigcup_{j=1}^{r} C_j = \Omega$.

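A complete partition $\mathbf{C} = \{C_1, \dots, C_r\}$ of $\Omega$ is a collection of (measurable) subsets $C_j \in \mathcal{F}$ of $\Omega$
such that $C_j \cap C_l = \emptyset$ for $j \neq l$ and $\bigcup_{j=1}^{r} C_j = \Omega$. A partition $\mathbf{C}'$ is a refinement of $\mathbf{C}$ if every set
from $\mathbf{C}'$ is a subset of some set from $\mathbf{C}$. In such a case, $\mathbf{C}$ is a coarsening of $\mathbf{C}'$.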

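If $\mathbf{C}' = \{C'_1, \dots, C'_r\}$ and $\mathbf{C}'' = \{C''_1, \dots, C''_s\}$ are two partitions of $\Omega$ then the partition
$\mathbf{C} = \mathbf{C}' \cap \mathbf{C}''$ is defined as the partition that consists of all sets of the form $C'_i \cap C''_j$:
$\mathbf{C}' \cap \mathbf{C}'' = \{C'_1 \cap C''_1, C'_1 \cap C''_2, \dots, C'_r \cap C''_s\}$ (see Fig. 2 for an illustration). Clearly,
$\mathbf{C}' \cap \mathbf{C}''$ is a refinement of both $\mathbf{C}'$ and $\mathbf{C}''$.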

If $D \subset \Omega$ is a subset of $\Omega$ and $\mathbf{C} = \{C_1, \dots, C_r\}$ is a partition of $\Omega$, the partition
$\mathbf{C}_D = \{D \cap C_1, \dots, D \cap C_r\}$ of $D$ will be called the partition of $D$ induced by the partition $\mathbf{C}$ of $\Omega$
(see Fig. 3).

Besides complete partitions of $\Omega$, we also make use of incomplete partitions $\mathbf{C} = \{C_1, \dots, C_r\}$
such that $\bigcup_{j=1}^{r} C_j \neq \Omega$. For any partition $\mathbf{C}$, we use the notation $\hat{C} = \bigcup_{j=1}^{r} C_j$. Clearly, partition $\mathbf{C}$
is complete if and only if $\hat{C} = \Omega$.
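The partition operations recalled above can be illustrated on a small finite base space. The following is only a sketch: the six-point base space, the function names, and the choice to drop empty intersections from joint and induced partitions are assumptions made for illustration, not part of the original text.

```python
from itertools import product

def is_partition(blocks, base):
    """True if the blocks are nonempty, pairwise disjoint subsets of base."""
    seen = set()
    for block in blocks:
        if not block or not block <= base or seen & block:
            return False
        seen |= block
    return True

def is_complete(blocks, base):
    """A partition is complete when the union of its blocks is the whole base set."""
    return set().union(*blocks) == set(base)

def joint(first, second):
    """Joint partition: all pairwise intersections (empty ones dropped, by assumption)."""
    return [a & b for a, b in product(first, second) if a & b]

def induced(partition, subset):
    """Partition of `subset` induced by `partition` (empty intersections dropped)."""
    return [subset & block for block in partition if subset & block]

def is_refinement(fine, coarse):
    """True if every block of `fine` is contained in some block of `coarse`."""
    return all(any(f <= c for c in coarse) for f in fine)

# Hypothetical six-point base space with two complete partitions.
omega = frozenset(range(6))
c1 = [frozenset({0, 1, 2}), frozenset({3, 4, 5})]
c2 = [frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})]
c12 = joint(c1, c2)            # blocks {0,1}, {2}, {3}, {4,5}
d = frozenset({1, 2, 3})
c2_d = induced(c2, d)          # blocks {1}, {2,3}: incomplete in omega, complete in d

print(is_refinement(c12, c1), is_refinement(c12, c2))
print(is_complete(c2_d, omega), is_complete(c2_d, d))
```

In this toy example the joint partition refines both of its factors, and the induced partition is an incomplete partition of the base space.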
FIG. 2. Two (complete) partitions of $\Omega$ and the corresponding joint partition.
FIG. 3. Partition $\mathbf{C}_D$ of set $D \subset \Omega$ induced by a partition $\mathbf{C}$ of $\Omega$.

For an arbitrary complete partition $\mathbf{C} = \{C_1, \dots, C_r\}$, the measure $P$ can be written as a linear
combination of conditional measures as

$$ P = \sum_{j=1}^{r} P(C_j) P_{C_j}, \tag{1} $$

where a conditional measure $P_C$ is defined, for any subset $C$ of $\Omega$ such that $P(C) > 0$, by

$$ P_C(D) = \frac{P(D \cap C)}{P(C)} \tag{2} $$

for arbitrary $D \in \mathcal{F}$.
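Equations (1) and (2) can be checked directly on a finite base space. The sketch below assumes a hypothetical four-point space with an arbitrarily chosen prior; exact rational arithmetic is used so that the decomposition holds with equality.

```python
from fractions import Fraction as F

def measure(p, subset):
    """P(D) for a finite base space with point masses p[w]."""
    return sum(p[w] for w in subset)

def conditional(p, c):
    """Conditional measure P_C as point masses on the base space; requires P(C) > 0."""
    pc = measure(p, c)
    if pc <= 0:
        raise ValueError("conditioning set must have positive measure")
    return {w: (p[w] / pc if w in c else F(0)) for w in p}

# Hypothetical prior on a four-point base space and a complete two-block partition.
p = {0: F(1, 10), 1: F(2, 10), 2: F(3, 10), 3: F(4, 10)}
partition = [{0, 1}, {2, 3}]

# Right-hand side of equation (1): sum_j P(C_j) * P_{C_j}.
mixture = {w: sum(measure(p, c) * conditional(p, c)[w] for c in partition)
           for w in p}

print(mixture == p)
print(measure(conditional(p, {0, 1}), {0}))   # P_C({0}) with C = {0, 1}
```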



III. INFORMATION ACQUISITION PRIMITIVES: QUESTIONS AND ANSWERS 



As was stated earlier, the basic information acquisition process involves the agent asking questions
of an information source [41] and the source providing answers. Here we recall the definition
of questions discussed in [2] and provide a definition of answers to be used later in this article.



A. Questions 



A question was originally defined in [36] as a set of logical assertions that answer it. A somewhat
different notion of a question was proposed in [42] where questions were identified with probability
distributions that are interpreted as requests for missing information. The direction suggested in
[36] was further pursued in [37-39] where a distributive lattice of questions defined as down-sets
of (sets of) logical assertions was described. Our definition of questions described in detail in the
companion paper [2] combines features of those proposed in [36, 37] on one hand and in [42] on the
other. Specifically, questions are associated with inclusion-free collections of subsets of the base
set $\Omega$ which, as mentioned in the previous section, comes "equipped" with a probability measure
$P$ describing the "initial state of information" available to the agent. Depending on the nature
of the particular collection of subsets, several different types of questions can result. Thus, for a
single subset, one obtains an ideal question of [37], a complete partition corresponds to a partition
question, and a finest partition (in case it exists) yields the central issue, i.e. the most detailed
partition question. Furthermore, a complete collection of subsets of $\Omega$ (regardless of it being a
partition) corresponds to a real question in terminology of [36] and [37]. These are the questions
that lie above the central issue in the distributive lattice of questions. The questions of interest to
us at this point [43] will be those corresponding to partitions of $\Omega$ - both complete and incomplete.
We will refer to them as complete and incomplete questions, respectively. Thus a question in what
follows is identified with a partition $\mathbf{C} = \{C_1, C_2, \dots, C_r\}$ of $\Omega$. The questions for which $\hat{C} = \Omega$
are called complete questions. In particular, incomplete questions for which the corresponding
partition consists of a single set $C \subset \Omega$ will often be called, following [37], ideal questions. We will
also use the terms "question" and "partition" interchangeably.

A difficulty functional $G(\Omega, \mathbf{C}, P)$ can be associated with any question $\mathbf{C}$. The particular form
of $G(\Omega, \mathbf{C}, P)$ can be determined if some requirements, which can be formulated as postulates,
are imposed. This was done in the companion paper [2] where a particular system of postulates
that embodied linearity and isotropy properties of the source's knowledge structure and, hence, the
difficulty functional, was proposed. The main theorem proved in [2] derives the general form of the
difficulty functional that is required to satisfy such postulates.



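Theorem 1 Let the functional $G(\Omega, \mathbf{C}, P)$, where $\mathbf{C} = \{C_1, \dots, C_r\}$, satisfy Postulates 1 through 6
(see [2]). Then it has the form

$$ G(\Omega, \mathbf{C}, P) = \frac{\sum_{j=1}^{r} u(C_j) P(C_j) \log \frac{1}{P(C_j)}}{\sum_{j=1}^{r} P(C_j)}, $$

where $u(C_j) = \frac{\int_{C_j} u(\omega)\, dP(\omega)}{P(C_j)}$ and $u: \Omega \to \mathbb{R}$ is an integrable nonnegative function on the parameter
space $\Omega$.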

In particular, for the given question $\mathbf{C}$, its difficulty depends, besides the initial probability
measure $P$, on the (integrable) function $u(\cdot)$ defined on the parameter space $\Omega$. This function was
called the pseudotemperature in [2] using parallels with thermodynamics (see [4] for more details).
The question difficulty then can be interpreted as the amount of pseudoenergy associated with
question $\mathbf{C}$.
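To make the role of the pseudotemperature concrete, the difficulty of a question on a finite base space can be computed as below. This is a minimal sketch that assumes the difficulty has the form $\sum_j u(C_j) P(C_j) \log (1/P(C_j)) / \sum_j P(C_j)$, with $u(C_j)$ the $P$-average of $u$ over $C_j$; the fruit example, the flat pseudotemperature, and the function name `difficulty` are illustrative assumptions.

```python
import math

def difficulty(u, p, question):
    """Assumed form: G = sum_j u(C_j) P(C_j) log(1/P(C_j)) / sum_j P(C_j)."""
    num = den = 0.0
    for block in question:
        p_block = sum(p[w] for w in block)
        u_block = sum(u[w] * p[w] for w in block) / p_block   # P-average of u on C_j
        num += u_block * p_block * math.log(1.0 / p_block)
        den += p_block
    return num / den

# Hypothetical prior over three fruits and a flat pseudotemperature u = 1.
p = {"apple": 0.5, "pear": 0.25, "peach": 0.25}
u_flat = {w: 1.0 for w in p}

complete = [{"apple"}, {"pear"}, {"peach"}]   # complete question
ideal = [{"apple"}]                           # ideal (single-set) question

print(difficulty(u_flat, p, complete))   # Shannon entropy of p in nats when u = 1
print(difficulty(u_flat, p, ideal))      # log(1 / P(apple))
```

With a constant pseudotemperature equal to one, the difficulty of a complete question under this assumed form reduces to the Shannon entropy of the prior over the blocks.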

B. Answers 

Given a question $\mathbf{C}$, a source is assumed to be capable of providing an answer. Our definition
of an answer differs somewhat from that proposed in [36] in that it aims to account for different
degrees of accuracy. Also, as opposed to the definition of Cox from [36], according to which an
answer is a logical assertion that answers no less than the question asked, we take special care to
make sure that no more than the question is being answered.

Since any information is represented by some measure on $\Omega$, it is reasonable to think of an answer
to question $\mathbf{C}$ as a message a reception of which implies certain changes in the initial measure $P$.
In an extreme case, a message can change the original measure to a measure supported at a single
element of $\Omega$ - this describes a complete resolution of the initial uncertainty and the best possible
answer to the exhaustive question which is the central issue of [37]. Taking these considerations
into account, and assuming, without loss of generality, that $P(C_j) > 0$ for all subsets $C_j$ in $\mathbf{C}$, we
adopt the following definition.

Definition: An answer to the question $\mathbf{C} = \{C_1, \dots, C_r\}$ is a message $V(\mathbf{C})$ that takes values
in the set $\{s_1, s_2, \dots, s_m\}$ such that a reception of the value $s_k$ of the message updates the initial
measure $P$ on $\Omega$ to the measure $P^k$ such that either $P^k(C_j) = 0$ or $P^k_{C_j} = P_{C_j}$ for all $k = 1, \dots, m$
and all $j = 1, \dots, r$.

The meaning of the condition $P^k_{C_j} = P_{C_j}$ in this definition is that an answer to question $\mathbf{C}$ should
not resolve more uncertainty than what the question $\mathbf{C}$ was requesting: a valid answer $V(\mathbf{C})$ does
not change the relative probabilities inside subsets $C_j$, but only the probabilities between the subsets
$C_j$ constituting the question. It is straightforward to show that for $V(\mathbf{C})$ to be an answer to a complete
question $\mathbf{C}$ according to this definition, it is necessary and sufficient for the updated measures $P^k$,
$k = 1, \dots, m$, to take the form

$$ P^k = \sum_{j=1}^{r} p_{kj} P_{C_j}, \tag{3} $$

where $p_{kj}$, $k = 1, \dots, m$, $j = 1, \dots, r$, are nonnegative coefficients such that $\sum_{j=1}^{r} p_{kj} = 1$ for
$k = 1, \dots, m$.

If question $\mathbf{C}$ is complete and $V(\mathbf{C})$ is a corresponding answer, we assume that $V(\mathbf{C})$ does not
change the original measure $P$ on average, or, formally speaking,

$$ \sum_{k=1}^{m} \Pr(V(\mathbf{C}) = s_k) P^k = P, \tag{4} $$

from which it follows, in particular, that if the answer is perfect, then $\Pr(V^*(\mathbf{C}) = s_j) = P(C_j)$.
We refer to (4) as the consistency with prior condition for the answer $V(\mathbf{C})$.

In the following, we denote the probability $\Pr(V(\mathbf{C}) = s_k)$, for any answer $V(\mathbf{C})$ to a complete
question $\mathbf{C}$, by $v_k$, for brevity. It is straightforward to show that it follows from the consistency
with prior condition (4) that

$$ \sum_{k=1}^{m} v_k p_{kj} = P(C_j), \quad j = 1, \dots, r. $$
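The structure of an answer to a complete question can be illustrated numerically. The sketch below assumes a hypothetical four-point base space, a two-block question, and a two-valued message with hand-picked coefficients $p_{kj}$ and probabilities $v_k$; it builds the updated measures as mixtures of the conditional measures $P_{C_j}$ and checks the consistency with prior condition exactly.

```python
from fractions import Fraction as F

def updated_measure(p, question, row):
    """P^k = sum_j p_kj * P_{C_j}, where `row` holds p_k1, ..., p_kr."""
    pk = {w: F(0) for w in p}
    for block, weight in zip(question, row):
        p_block = sum(p[w] for w in block)
        for w in block:
            pk[w] += weight * p[w] / p_block
    return pk

def consistent_with_prior(p, question, pkj, v):
    """Check sum_k v_k P^k = P pointwise on the base space."""
    measures = [updated_measure(p, question, row) for row in pkj]
    return all(sum(vk * pk[w] for vk, pk in zip(v, measures)) == p[w] for w in p)

# Hypothetical prior, question, and noisy two-valued answer.
p = {0: F(1, 8), 1: F(3, 8), 2: F(1, 4), 3: F(1, 4)}
question = [{0, 1}, {2, 3}]            # P(C_1) = P(C_2) = 1/2
pkj = [[F(3, 4), F(1, 4)],             # coefficients for message value s_1
       [F(1, 4), F(3, 4)]]             # coefficients for message value s_2
v = [F(1, 2), F(1, 2)]                 # Pr(V = s_1), Pr(V = s_2)

p1 = updated_measure(p, question, pkj[0])
print(p1)
print(consistent_with_prior(p, question, pkj, v))
```

Within each block the updated measure keeps the prior's relative weights (here 1:3 inside the first block), which is the "no more than the question" requirement of the definition.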

Incomplete questions, including ideal questions, are interpreted, as explained in [2], as "aspects"
of complete questions conditioned on the corresponding subsets of $\Omega$ being true statements. For
example, if the complete question is "Is the fruit an apple, a pear or a peach?" then the ideal
question corresponding to the element (subset) "Apple" of the base space can be understood as the
complete question provided the fruit is really an apple. Thus ideal and, more generally, incomplete,
questions are never formulated and posed as such [44] in real inquiry situations. On the other hand,
when a real question is posed, it is always true that an ideal question is implicitly asked. Often, 
though, neither the agent nor the source (unless the source is capable of providing perfect answers) 
actually know which ideal question is being asked. The reason is simply that knowledge of
the correct answer [45] to the given question is needed for that.

Let $\mathbf{C} = \{C_1, \dots, C_r\}$ be a complete question and let $C_j \in \mathbf{C}$ be one of its "constituent"
ideal questions. Suppose $V(\mathbf{C})$ is some answer to $\mathbf{C}$. We denote by $q_k^j$ the probability that the
corresponding answer to $C_j$ takes the value $s_k$. Then one can apply Bayes' rule to obtain that

$$ q_k^j = \Pr(V(\mathbf{C}) = s_k \mid \omega \in C_j) = \frac{v_k p_{kj}}{P(C_j)}. \tag{5} $$

In particular, if the answer $V(\mathbf{C})$ is perfect, then it follows from (5) (since $p_{kj} = \delta_{kj}$ and $v_j = P(C_j)$)
that $q_k^j = \delta_{kj}$, i.e. every ideal question $C_j$ receives just a single answer - equal to its correct answer.

To consider a more general incomplete question, assume, without loss of generality, that such a
question has the form $\mathbf{C}' = \{C_1, \dots, C_l\}$ where $l < r$. If, just like above, $V(\mathbf{C})$ is some answer to
the complete question $\mathbf{C} = \{C_1, \dots, C_r\}$, the probabilities $q_k$ of different values of the "induced"
answer to $\mathbf{C}'$ can be found by conditioning on $\hat{C}' = \bigcup_{j=1}^{l} C_j$:

$$ q_k = \Pr(V(\mathbf{C}) = s_k \mid \omega \in \hat{C}') = v_k \frac{P^k(\hat{C}')}{P(\hat{C}')}. \tag{6} $$
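The Bayes' rule computations for ideal and incomplete questions can be sketched as follows. The fruit prior, the two-valued noisy answer, and the function names are hypothetical; the induced probabilities are computed as $v_k \sum_{j} p_{kj} / \sum_{j} P(C_j)$ over the blocks of the incomplete question, which is the conditioning described above written out for a finite question.

```python
from fractions import Fraction as F

def ideal_answer_probs(v, pkj, pc, j):
    """q_k^j = v_k * p_kj / P(C_j) for k = 1, ..., m (Bayes' rule)."""
    return [v[k] * pkj[k][j] / pc[j] for k in range(len(v))]

def induced_answer_probs(v, pkj, pc, blocks):
    """q_k = v_k * sum_{j in blocks} p_kj / sum_{j in blocks} P(C_j)."""
    total = sum(pc[j] for j in blocks)
    return [v[k] * sum(pkj[k][j] for j in blocks) / total for k in range(len(v))]

# Hypothetical complete question "apple, pear or peach?" with prior block measures pc.
pc = [F(1, 2), F(1, 4), F(1, 4)]
pkj = [[F(3, 4), F(1, 8), F(1, 8)],    # updated block measures after message s_1
       [F(1, 4), F(3, 8), F(3, 8)]]    # updated block measures after message s_2
v = [F(1, 2), F(1, 2)]

q_apple = ideal_answer_probs(v, pkj, pc, 0)            # ideal question "Apple"
q_not_apple = induced_answer_probs(v, pkj, pc, [1, 2]) # incomplete question {pear, peach}

# A perfect answer: p_kj = delta_kj and v_j = P(C_j).
identity = [[F(int(k == j)) for j in range(3)] for k in range(3)]
q_perfect = ideal_answer_probs(pc, identity, pc, 1)

print(q_apple, q_not_apple, q_perfect)
```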

In the following, we will sometimes refer to incomplete (including ideal) questions and answers 
to them without a simultaneous explicit reference to the corresponding complete (and thus real) 
question. It has to be kept in mind, however, that incomplete questions and corresponding answers 
are auxiliary constructs in the sense described above. 

While the functional $G(\Omega, \mathbf{C}, P)$ measures difficulty of questions, it would be desirable to develop
a measure of the amount of difficulty in $\mathbf{C}$ that is resolved by the answer $V(\mathbf{C})$. As was mentioned
earlier, the question difficulty can be interpreted as the amount of pseudoenergy associated with
the question. It is reasonable to expect the amount of pseudoenergy contained in a perfect answer
to be equal to that in the question itself and, respectively, any
other answer to contain somewhat less pseudoenergy - as long as it is an answer to $\mathbf{C}$ and not
some other question.

In the following we denote the amount of pseudoenergy contained in the answer $V(\mathbf{C})$ - the
depth of $V(\mathbf{C})$ - by $Y(\Omega, \mathbf{C}, P, V(\mathbf{C}))$ to emphasize its dependence on $\Omega$ and the initial measure $P$.



IV. ANSWER DEPTH FUNCTIONAL 



In this section, our goal is to derive the general form of the answer depth functional by imposing 
certain plausible requirements it has to satisfy. These requirements, which we call Postulates, are
similar to those stated in Postulates Q1 through Q6 for questions (see [2]).

Since information in $V(\mathbf{C})$ is conveyed by means of modifying the original measure $P$ and the
latter is modified differently for each value of the message $V(\mathbf{C})$, the depth functional for the answer
$V(\mathbf{C})$ should be equal to the expected value over possible values of the message $V(\mathbf{C})$:

$$ Y(\Omega, \mathbf{C}, P, V(\mathbf{C})) = \sum_{k=1}^{m} \Pr(V(\mathbf{C}) = s_k)\, Y(\Omega, \mathbf{C}, P, P^k), \tag{7} $$

where $P^k$ is the measure modified by the reception of $V(\mathbf{C}) = s_k$ and $Y(\Omega, \mathbf{C}, P, P^k)$ is the
conditional depth that depends on the modified measure $P^k$.






We now impose some reasonable requirements on conditional depth functionals $Y(\Omega, \mathbf{C}, P, P^k)$
which are formulated as postulates as in [2].

The first such requirement is that the conditional depth should vanish if the measure is not 
modified at all, i.e. if P k = P. On the other hand, if the modified measure assigns larger proba- 
bilities to all subsets in C (which can happen only for incomplete questions), then the conditional 
depth should be strictly positive. This is the content of Postulate A1.

Postulate A1 (Correct direction). Let $\mathbf{C} = \{C_1, \dots, C_r\}$ be any question. Then $Y(\Omega, \mathbf{C}, P, P^k) = 0$
if $P^k(C_j) = P(C_j)$ for all $j = 1, \dots, r$ and $Y(\Omega, \mathbf{C}, P, P^k) > 0$ if $P^k(C_j) > P(C_j)$ for all $j = 1, \dots, r$.

The second part of the postulate says that, for an ideal question, if, for instance, upon reception
of the value $s_k$ of $V(C)$ the set $C$ has a higher probability than before then the value $s_k$ has a positive
amount of pseudoenergy. For example, if the original question was "What kind of fruit is it?" with
"Pear" being the correct answer, then an answer such as "It looks a lot like a pear" or
"It's either a pear or an apple" is assigned positive pseudoenergy as it moves "in
the right direction" towards the correct answer.

The next postulate parallels Postulate Q2 for questions (see [2]).

Postulate A2 (Continuity). The function $Y(\Omega, \mathbf{C}, P, P^k)$ is continuous in all parameters it
may depend upon.

The next postulate follows from the requirement that if $V(\mathbf{C})$ is an answer to question $\mathbf{C}$ then
the depth of $V(\mathbf{C})$ cannot exceed the difficulty of $\mathbf{C}$. This property is easiest to state for ideal
questions $C \subset \Omega$.

Postulate A3 (Ideal complete answer). Let $C$ be an ideal question and suppose $P^k(C) = 1$.
Then

$$ Y(\Omega, C, P, P^k) = G(\Omega, C, P). $$

This postulate expresses a simple desideratum that an exhaustive correct answer to a question 
should convey exactly the amount of information requested by the question. For instance, if the 
question is "What fruit is it?" with "Apple" as a correct answer then the answer "Apple" should 
carry all the information the question was asking for. 

The next three postulates parallel Postulates Q3 through Q5 for questions. 

Postulate A4 (Incomplete question answer decomposition). Let $\mathbf{C} = \{C_1, \dots, C_r\}$ be an incomplete
question. Then

$$ Y(\Omega, \mathbf{C}, P, P^k) = Y(\Omega, \hat{C}, P, P^k) + Y(\hat{C}, \mathbf{C}, P_{\hat{C}}, P^k_{\hat{C}}). $$
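Postulate A5 (Mean value). Let $\mathbf{C}'$ and $\mathbf{C}''$ be two incomplete questions such that $\hat{C}' \cap \hat{C}'' = \emptyset$.
Then

$$ Y(\Omega, \mathbf{C}' \cup \mathbf{C}'', P, P^k) = \frac{P^k(\hat{C}')\, Y(\Omega, \mathbf{C}', P, P^k) + P^k(\hat{C}'')\, Y(\Omega, \mathbf{C}'', P, P^k)}{P^k(\hat{C}' \cup \hat{C}'')}. $$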

Just like we did for questions, we can say that the subset $D$ of $\Omega$ is homogeneous iff the
conditional depth functional depends only on measures of partition $\mathbf{C}$ whenever $\hat{C} \subset D$, i.e.
$Y(D, C, P_D, P^k_D) = f(P_D(C), P^k_D(C))$ for an ideal question $C \subset D$. In particular, any atom (minimal set) of $\mathcal{F}$ is homogeneous.

Postulate A6 (Homogeneous ideal sequentiality). Let $D \subset \Omega$ be a homogeneous subset of the
parameter space and let $C$ be a question such that $C \subset D$. Then

$$ Y(\Omega, C, P, P^k) = Y(\Omega, D, P, P^k) + Y(D, C, P_D, P^k_D). $$

We can now state the main result about the possible shape of the conditional answer depth
functional $Y(\Omega, \mathbf{C}, P, P^k)$. It is formulated as a theorem.

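Theorem 2 Let Postulates A1 through A6 hold. Then the conditional answer depth functional
$Y(\Omega, \mathbf{C}, P, P^k)$ has the following form

$$ Y(\Omega, \mathbf{C}, P, P^k) = \frac{\sum_{j=1}^{r} u(C_j) P^k(C_j) \log \frac{P^k(C_j)}{P(C_j)}}{\sum_{j=1}^{r} P^k(C_j)}, $$

where $u(C_j) = \frac{\int_{C_j} u(\omega)\, dP^k(\omega)}{P^k(C_j)}$ and the integrable function $u: \Omega \to \mathbb{R}$ is the same that is used in
characterizing the question difficulty functional $G(\cdot)$.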
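As a quick consistency check, offered as a sketch rather than as part of the original argument, one can specialize the conditional depth to an ideal question $C$ (a single set, $r = 1$). The check assumes the weighted-logarithm form that the proof arrives at in equation (15), and the comparison with $G$ in the last line assumes that the difficulty of an ideal question equals $u(C) \log (1/P(C))$. Under these assumptions Postulates A1 and A3 are recovered:

```latex
\begin{align*}
Y(\Omega, C, P, P^k)
  &= \frac{u(C)\, P^k(C) \log \frac{P^k(C)}{P(C)}}{P^k(C)}
   = u(C) \log \frac{P^k(C)}{P(C)}, \\
P^k(C) = P(C) &\;\Rightarrow\; Y(\Omega, C, P, P^k) = u(C) \log 1 = 0, \\
P^k(C) > P(C),\ u(C) > 0 &\;\Rightarrow\; Y(\Omega, C, P, P^k) > 0, \\
P^k(C) = 1 &\;\Rightarrow\; Y(\Omega, C, P, P^k) = u(C) \log \frac{1}{P(C)} = G(\Omega, C, P).
\end{align*}
```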

Proof: The proof is similar to that of the main theorem in [2]. Let $\mathbf{A} = \{A_1, \dots, A_M\}$ be a
complete and sufficiently fine partition of $\Omega$. We can assume, without loss of generality, that the
sigma-algebra $\mathcal{F}$ on $\Omega$ is comprised of all unions of sets in $\mathbf{A}$. Note, in particular, that all subsets
in $\mathbf{A}$ are homogeneous.

Let $D$ be a homogeneous subset of $\Omega$ and let $C' \subset C \subset D$ be two subsets of $D$. Then, by
Postulate A6,

$$ Y(\Omega, C, P, P^k) = Y(\Omega, D, P, P^k) + Y(D, C, P_D, P^k_D), \tag{8} $$

and

$$ Y(D, C', P_D, P^k_D) = Y(D, C, P_D, P^k_D) + Y(C, C', P_C, P^k_C). \tag{9} $$

Since $D$ is homogeneous it follows from (9) that

$$ f(P_D(C'), P^k_D(C')) = f(P_D(C), P^k_D(C)) + f\!\left(\frac{P_D(C')}{P_D(C)}, \frac{P^k_D(C')}{P^k_D(C)}\right). $$
Then standard arguments using Postulates A1 and A2 (see [24] for details) lead to the conclusion
that the function $f(\cdot)$ has the form

$$ f(p, q) = c \log \frac{q}{p}, $$

where $c$ is a positive constant. Going back to $Y$ we obtain

$$ Y(D, C, P_D, P^k_D) = u'(D) \log \frac{P^k_D(C)}{P_D(C)}, \tag{10} $$

where $u'(D) > 0$ is a constant that can possibly depend on the particular homogeneous subset $D$.
Substituting (10) into (8) we arrive at

$$ Y(\Omega, C, P, P^k) - Y(\Omega, D, P, P^k) = Y(D, C, P_D, P^k_D) = u'(D) \log \frac{P^k_D(C)}{P_D(C)} = u'(D) \log \frac{P^k(C)}{P(C)} - u'(D) \log \frac{P^k(D)}{P(D)}, $$

from which it follows (using continuity of $Y$ and the fact that the subset $C \subset D$ is arbitrary) that

$$ Y(\Omega, C, P, P^k) = u'(D) \log \frac{P^k(C)}{P(C)} + v'(D), $$

for any $C \subset D$ whenever $D$ is homogeneous. Here $v'(D)$ is another constant that can possibly
depend on the homogeneous subset $D$. We can now use Postulate A3 to conclude that $u'(D) = u(D)$
for all homogeneous sets $D$ and that $v'(D) = 0$. This leads to the following expression for the
conditional depth functional of an answer to an ideal question lying inside a homogeneous subset:

$$ Y(\Omega, C, P, P^k) = u(D) \log \frac{P^k(C)}{P(C)}. \tag{11} $$

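Now let $\mathbf{D} = \{D_1, \dots, D_N\}$ be a (complete) partition of $\Omega$ such that every subset in $\mathbf{D}$ is
homogeneous. Let $C \subset \Omega$ be an ideal question. Using Postulate A5, we can write

$$Y(\Omega, \mathbf{D}_C, P, P^k) = \frac{\sum_{j=1}^{N} u(D_j)\, P^k(C \cap D_j) \log\frac{P^k(C \cap D_j)}{P(C \cap D_j)}}{\sum_{j=1}^{N} P^k(C \cap D_j)} \tag{12}$$

and

$$Y(C, \mathbf{D}_C, P_C, P^k_C) = \sum_{j=1}^{N} u(D_j)\, \frac{P^k(C \cap D_j)}{P^k(C)} \log\frac{P^k(C \cap D_j)/P^k(C)}{P(C \cap D_j)/P(C)}. \tag{13}$$

An application of Postulate A4 now yields

$$Y(\Omega, C, P, P^k) = Y(\Omega, \mathbf{D}_C, P, P^k) - Y(C, \mathbf{D}_C, P_C, P^k_C) = \sum_{j=1}^{N} u(D_j)\, \frac{P^k(C \cap D_j)}{P^k(C)} \log\frac{P^k(C)}{P(C)} = u(C) \log\frac{P^k(C)}{P(C)},$$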
where

$$u(C) = \frac{\sum_{j=1}^{N} P^k(C \cap D_j)\, u(D_j)}{P^k(C)} = \frac{\int_C u(\omega)\, dP^k(\omega)}{P^k(C)}. \tag{14}$$

Here, the function $u\colon \Omega \to \mathbb{R}$ is defined as

$$u(\omega) = \sum_{j=1}^{N} u(D_j)\, I_{D_j}(\omega),$$

and therefore is exactly the same function that was used to describe the question difficulty
functional $G$.

Finally, let $\mathbf{C} = \{C_1, \dots, C_r\}$ be an arbitrary question on $\Omega$. An application of Postulate A5
yields

$$Y(\Omega, \mathbf{C}, P, P^k) = \frac{\sum_{j=1}^{r} u(C_j)\, P^k(C_j) \log\frac{P^k(C_j)}{P(C_j)}}{\sum_{j=1}^{r} P^k(C_j)}, \tag{15}$$

where $u(C_j)$ is given by (14). □

Having found the expression for the conditional depth functional we can now use it to obtain the
unconditional (expected) answer depth $Y(\Omega, \mathbf{C}, P, V(\mathbf{C}))$. We formulate the result as a corollary.



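Corollary 1 The answer depth functional $Y(\Omega, \mathbf{C}, P, V(\mathbf{C}))$ has the form

$$Y(\Omega, \mathbf{C}, P, V(\mathbf{C})) = \sum_{k=1}^{m} \Pr(V(\mathbf{C}) = s_k)\, \frac{\sum_{j=1}^{r} u(C_j)\, P^k(C_j) \log\frac{P^k(C_j)}{P(C_j)}}{\sum_{j=1}^{r} P^k(C_j)},$$

where $P^k$ is the measure on $\Omega$ updated by a reception of $V(\mathbf{C}) = s_k$ and $u(C_j)$ is defined in
Theorem 2.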
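
The corollary can be turned into a small numerical routine. The following is a minimal sketch, assuming the conditional depth has the ratio form of (15) and that the weights $u(C_j)$ are supplied directly as numbers; the function names and the two-subset data are hypothetical and serve only as an illustration.

```python
import math


def conditional_depth(u, prior, post):
    """Conditional depth of one answer message, following the form of (15).

    u[j]     -- weight u(C_j) of subset C_j (taken as given, an assumption)
    prior[j] -- original probability P(C_j)
    post[j]  -- updated probability P^k(C_j)
    """
    # Terms with post[j] == 0 contribute nothing (0 * log 0 is taken as 0).
    numerator = sum(
        uj * q * math.log(q / p) for uj, p, q in zip(u, prior, post) if q > 0
    )
    return numerator / sum(post)


def expected_depth(u, prior, message_probs, posts):
    """Expected depth: conditional depths averaged over the answer messages."""
    return sum(
        v * conditional_depth(u, prior, post)
        for v, post in zip(message_probs, posts)
    )


# Hypothetical complete question with two subsets and a symmetric noisy answer.
u = [1.0, 1.0]
prior = [0.5, 0.5]
posts = [[0.9, 0.1], [0.1, 0.9]]   # P^k(C_j) for messages s_1 and s_2
message_probs = [0.5, 0.5]         # Pr(V(C) = s_k)
depth = expected_depth(u, prior, message_probs, posts)
```

In this illustration the weights are identically one, so the expected depth coincides with the mutual information between the answer message and the subset index; an answer that leaves the measure unchanged has zero depth.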



V. RELATIONSHIPS BETWEEN DIFFICULTY AND DEPTH 

Theorems 1 and 2 (together with Corollary 1) establish the overall form that question difficulty
and answer depth, respectively, can take. The conditional depth functional $Y(\Omega, \mathbf{C}, P, P^k)$ depends,
besides the original measure $P$, on the updated measure $P^k$.
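A. Complete questions

Let us assume the consistency with prior condition holds and consider the answer depth
functional given by Corollary 1. Since for a complete question $\sum_{j=1}^{r} P^k(C_j) = 1$, we can write,
denoting $\Pr(V(\mathbf{C}) = s_k)$ by $v_k$,

$$\begin{aligned}
Y(\Omega, \mathbf{C}, P, V(\mathbf{C})) &= \sum_{k=1}^{m} v_k \sum_{j=1}^{r} u(C_j)\, P^k(C_j) \log\frac{P^k(C_j)}{P(C_j)} \\
&= \sum_{k=1}^{m} \sum_{j=1}^{r} v_k\, u(C_j)\, P^k(C_j) \log P^k(C_j) - \sum_{k=1}^{m} \sum_{j=1}^{r} v_k\, u(C_j)\, P^k(C_j) \log P(C_j) \\
&\overset{(a)}{=} \sum_{k=1}^{m} \sum_{j=1}^{r} v_k\, u(C_j)\, P^k(C_j) \log P^k(C_j) + G(\Omega, \mathbf{C}, P) \overset{(b)}{\le} G(\Omega, \mathbf{C}, P),
\end{aligned}$$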

where (a) follows from the consistency with prior condition and Theorem 1, and (b) follows from the
inequality $\log P^k(C_j) \le 0$. It is also clear that the inequality (b) becomes an equality if and only if,
for every value $s_k$ of the answer message, either $P^k(C_j) = 0$ or $\log P^k(C_j) = 0$ for every value of
the index $j$. For the latter to be true it is necessary and sufficient that, for all values of $k$,

$$P^k(C_j) = \delta_{f(k), j}, \tag{16}$$

where $f\colon \{1, 2, \dots, m\} \to \{1, 2, \dots, r\}$ is a map from the set of possible values of index $k$ to that
of index $j$. Substituting (16) into the consistency with prior condition we obtain

$$P(C_j) = \sum_{k=1}^{m} v_k\, \delta_{f(k), j} = \sum_{k:\, f(k) = j} v_k. \tag{17}$$

It is easy to see that without loss of generality one can define an equivalent message $V'(\mathbf{C})$ such
that $V'(\mathbf{C}) = s_j$ whenever $V(\mathbf{C}) = s_k$ with $f(k) = j$. Then (17) becomes simply

$$P(C_j) = \Pr(V'(\mathbf{C}) = s_j). \tag{18}$$

Recall that a perfect answer to a complete question $\mathbf{C} = \{C_1, \dots, C_r\}$ is defined as a message
$V^*(\mathbf{C})$ taking values in the set $\{s_1, \dots, s_r\}$ such that $P^k(C_j) = \delta_{kj}$, and, as a consequence,
$\Pr(V^*(\mathbf{C}) = s_j) = P(C_j)$. Then we can state the result obtained above as a lemma.



Lemma 1 Let $\mathbf{C}$ be a complete question and assume that the consistency with prior condition holds
for any answer $V(\mathbf{C})$ to $\mathbf{C}$. Then $Y(\Omega, \mathbf{C}, P, V(\mathbf{C})) \le G(\Omega, \mathbf{C}, P)$, with the inequality being tight
if and only if the answer $V(\mathbf{C})$ is perfect (up to trivial equivalences).
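
Lemma 1 can be checked numerically on a small case. The sketch below is only an illustration under stated assumptions: the answer is generated by Bayes' rule from a hypothetical likelihood table (which makes the consistency with prior condition hold automatically), the difficulty is taken in the form $G = -\sum_j u(C_j) P(C_j) \log P(C_j)$, and the weights and probabilities are made-up numbers.

```python
import math


def difficulty(u, prior):
    """Question difficulty G = -sum_j u(C_j) P(C_j) log P(C_j)."""
    return -sum(uj * p * math.log(p) for uj, p in zip(u, prior) if p > 0)


def bayes_answer(prior, likelihood):
    """Build message probabilities v_k and updated measures P^k by Bayes' rule.

    likelihood[j][k] is an assumed Pr(message s_k | outcome lies in C_j).
    Bayesian updating guarantees sum_k v_k P^k(C_j) = P(C_j).
    """
    r, m = len(prior), len(likelihood[0])
    v = [sum(prior[j] * likelihood[j][k] for j in range(r)) for k in range(m)]
    posts = [
        [prior[j] * likelihood[j][k] / v[k] for j in range(r)] for k in range(m)
    ]
    return v, posts


def expected_depth(u, prior, v, posts):
    """Expected depth for a complete question (each P^k sums to one)."""
    return sum(
        vk * sum(uj * q * math.log(q / p)
                 for uj, p, q in zip(u, prior, post) if q > 0)
        for vk, post in zip(v, posts)
    )


u = [0.8, 1.0, 1.3]        # hypothetical weights u(C_j)
prior = [0.2, 0.3, 0.5]    # P(C_j)
noisy = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]
perfect = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

g = difficulty(u, prior)
depth_noisy = expected_depth(u, prior, *bayes_answer(prior, noisy))
depth_perfect = expected_depth(u, prior, *bayes_answer(prior, perfect))
```

For the noisy likelihood table the depth stays strictly below the difficulty, while the identity table (a perfect answer) attains it.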
B. Ideal and other incomplete questions

Let $C \subset \Omega$ be an ideal question. We can write the depth functional for a corresponding answer
$V(C)$ (denoting by $q_k$ the probability that $V(C) = s_k$) as follows.

$$\begin{aligned}
Y(\Omega, C, P, V(C)) &= \sum_{k=1}^{m} q_k\, \frac{u(C)\, P^k(C) \log\frac{P^k(C)}{P(C)}}{P^k(C)} \\
&= u(C) \sum_{k=1}^{m} q_k \log P^k(C) - u(C) \log P(C) \sum_{k=1}^{m} \Pr(V(C) = s_k) \\
&= u(C) \sum_{k=1}^{m} q_k \log P^k(C) + G(\Omega, C, P) \overset{(a)}{\le} G(\Omega, C, P),
\end{aligned}$$

where (a) follows from the inequality $\log P^k(C) \le 0$. It is straightforward to see that for the
inequality (a) to become an equality it is necessary and sufficient that $P^k(C) = 1$ for all values $k$
of the answer message. Clearly, in that case, we can define an equivalent message $V'(C)$ that takes
a single value $s$ so that $P^s(C) = 1$.

It appears reasonable to define a perfect answer $V^*(C)$ to an ideal question $C$ as a message
taking a single value $s$ such that $P^s(C) = 1$. Note that while such a definition may sound a bit
strange (since the answer takes a single value), it makes good sense if one keeps in mind that ideal
questions are auxiliary constructions and any answer to an ideal question should really be thought
of as a "part" of an answer to some complete question. We can again state the result obtained above
as a lemma.



Lemma 2 Let $C$ be an ideal question and $V(C)$ an answer to it. Then $Y(\Omega, C, P, V(C)) \le G(\Omega, C, P)$,
with the inequality being tight if and only if the answer $V(C)$ is perfect.

Finally, let $\mathbf{C} = \{C_1, \dots, C_r\}$ be a complete question and $\mathbf{C}' = \{C_1, \dots, C_l\}$, where $l < r$, be an
incomplete question. We define a perfect answer $V^*(\mathbf{C}')$ to an incomplete question as a message
taking values in the set $\{s_1, \dots, s_l\}$ such that $P^j(C_j) = 1$ for $j = 1, \dots, l$. As usual, $C' = \bigcup_{j=1}^{l} C_j$.
Then it is straightforward to prove a result analogous to that of Lemmas 1 and 2.

Lemma 3 Suppose $\mathbf{C}'$ is an incomplete question and $V(\mathbf{C}')$ is an answer to it. Then
$Y(\Omega, \mathbf{C}', P, V(\mathbf{C}')) \le G(\Omega, \mathbf{C}', P)$, with the inequality being tight if and only if the answer $V(\mathbf{C}')$ is perfect.
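Proof: We can write the depth functional for $V(\mathbf{C}')$ as follows.

$$\begin{aligned}
Y(\Omega, \mathbf{C}', P, V(\mathbf{C}')) &= \sum_{k=1}^{m} v_k\, \frac{\sum_{j=1}^{l} u(C_j)\, P^k(C_j) \log\frac{P^k(C_j)}{P(C_j)}}{P^k(C')} \\
&\overset{(a)}{=} \frac{1}{P(C')} \sum_{k=1}^{m} \sum_{j=1}^{l} v_k\, u(C_j)\, P^k(C_j) \log P^k(C_j) - \frac{1}{P(C')} \sum_{k=1}^{m} \sum_{j=1}^{l} v_k\, u(C_j)\, P^k(C_j) \log P(C_j) \\
&\overset{(b)}{=} \frac{1}{P(C')} \sum_{k=1}^{m} \sum_{j=1}^{l} v_k\, u(C_j)\, P^k(C_j) \log P^k(C_j) - \frac{1}{P(C')} \sum_{j=1}^{l} u(C_j)\, P(C_j) \log P(C_j) \\
&= \frac{1}{P(C')} \sum_{k=1}^{m} \sum_{j=1}^{l} v_k\, u(C_j)\, P^k(C_j) \log P^k(C_j) + G(\Omega, \mathbf{C}', P) \overset{(c)}{\le} G(\Omega, \mathbf{C}', P),
\end{aligned}$$

where (a) follows from (6), (b) follows from the consistency with prior condition for the complete
question $\mathbf{C}$, and (c) follows from the inequality $\log P^k(C_j) \le 0$. Using the same arguments as those
employed for the proof of Lemma 1 we arrive at the statement of this lemma. □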

One can summarize the main result of this section by saying that, for any question type, the 
depth of any corresponding answer cannot exceed the difficulty of the question. Moreover, the 
answer depth can only be equal to the question difficulty in case the answer is perfect, i.e. the 
answer fully resolves the uncertainty associated with the question and does so with certainty. 
The number of different values of a perfect answer is always equal to the number of subsets in 
the question. While incomplete questions - including ideal questions - are just useful auxiliary 
constructs, the same basic property holds for them as well, at least for the isotropic knowledge 
structure model considered in the present article. 



VI. QUASI-PERFECT ANSWERS 

Let the question $\mathbf{C} = \{C_1, \dots, C_r\}$ be complete and let $V(\mathbf{C})$ be an answer to $\mathbf{C}$. If $V(\mathbf{C})$ is
perfect, its depth $Y(\Omega, \mathbf{C}, P, V(\mathbf{C}))$ is equal to the difficulty $G(\Omega, \mathbf{C}, P)$ of $\mathbf{C}$, as Lemma 1 states.
Here we would like to consider some simple classes of imperfect answers. To make the form of an
imperfect answer more specific, let us assume such an answer to resemble a perfect one in that the
number of possible values it can take is equal to $r$ and each message $s_k$, $k = 1, \dots, r$, expresses
a degree of preference towards the subset $C_k$. Let $e_k$ be the error probability associated with $s_k$,
i.e. $e_k = P^k(\bar{C}_k)$, where $\bar{C}_k = \Omega \setminus C_k$. Let us also make the additional assumption that the error
associated with $s_k$ is "proportionally distributed" between the sets $C_j$, $j \ne k$, i.e.
$P^k(C_j) = e_k \frac{P(C_j)}{1 - P(C_k)}$. Obviously, both of these assumptions can be stated in the following way:

$$P^k = \left(1 - \frac{e_k}{1 - P(C_k)}\right) P_{C_k} + \frac{e_k}{1 - P(C_k)}\, P,$$

implying that the coefficients $p_{kj}$ in (3) have the form

$$p_{kj} = \left(1 - \frac{e_k}{1 - P(C_k)}\right) \delta_{kj} + \frac{e_k\, P(C_j)}{1 - P(C_k)}. \tag{19}$$

To further simplify the analysis and provide a more concise description of errors associated with
imperfect answers we make a further assumption: that the error probability $e_k$ constitutes the
same fraction of $P(\bar{C}_k)$ for all values of $k$, i.e. $e_k = \alpha (1 - P(C_k))$, $k = 1, \dots, r$, where $0 \le \alpha \le 1$.
Under this assumption, the error associated with the answer $V(\mathbf{C})$, which we will denote by $V_\alpha(\mathbf{C})$,
is fully described by a single parameter $\alpha$. The coefficients $p_{kj}$ in (19) become

$$p_{kj} = (1 - \alpha)\, \delta_{kj} + \alpha P(C_j), \tag{20}$$

and the updated measure $P^k$ becomes simply

$$P^k = \alpha P + (1 - \alpha) P_{C_k}. \tag{21}$$

We see that, for $\alpha = 0$, the measure $P^k$ turns into the conditional measure $P_{C_k}$, making the answer
perfect, and for $\alpha = 1$ each measure $P^k$ becomes the original measure $P$, thus rendering the answer
$V_\alpha(\mathbf{C})$ empty, i.e. possessing vanishing depth.

Substituting (21) into the general expression for the answer depth and using the fact that in
this case $v_k = P(C_k)$, $k = 1, \dots, r$, we can obtain

$$Y(\Omega, \mathbf{C}, P, V_\alpha(\mathbf{C})) = \sum_{k=1}^{r} u(C_k)\, P(C_k)\, (1 - \alpha + \alpha P(C_k)) \log\frac{1 - \alpha + \alpha P(C_k)}{P(C_k)} + \alpha \log\alpha \sum_{k=1}^{r} u(C_k)\, P(C_k)\, (1 - P(C_k)). \tag{22}$$

It is easy to see that the expression (22) becomes $G(\Omega, \mathbf{C}, P)$ for $\alpha = 0$ and vanishes for $\alpha = 1$.

In the following we will call answers characterized by updated measures of the form (21) and
depth functionals given by (22) the quasi-perfect answers. Their advantage is that they allow one to
smoothly interpolate between perfect and empty answers using just a single parameter $\alpha$ taking
values on the interval $[0, 1]$.
Substituting (20) into the consistency with prior condition it is easy to see that for quasi-perfect answers

$$v_j = P(C_j), \tag{23}$$

for $j = 1, \dots, r$, regardless of the value of error probability $\alpha$.
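
The closed form for quasi-perfect answers can be cross-checked against the general depth expression. The sketch below assumes that (22) reads $\sum_k u(C_k) P(C_k)(1 - \alpha + \alpha P(C_k)) \log\frac{1 - \alpha + \alpha P(C_k)}{P(C_k)} + \alpha \log\alpha \sum_k u(C_k) P(C_k)(1 - P(C_k))$ and that the coefficients are $p_{kj} = (1 - \alpha)\delta_{kj} + \alpha P(C_j)$ with $v_k = P(C_k)$; the weights and probabilities are hypothetical.

```python
import math


def difficulty(u, p):
    """Question difficulty G = -sum_k u(C_k) P(C_k) log P(C_k)."""
    return -sum(uk * pk * math.log(pk) for uk, pk in zip(u, p))


def quasi_perfect_depth(u, p, alpha):
    """Closed-form depth of a quasi-perfect answer (assumed form of (22))."""
    first = sum(
        uk * pk * (1 - alpha + alpha * pk)
        * math.log((1 - alpha + alpha * pk) / pk)
        for uk, pk in zip(u, p)
    )
    weight = sum(uk * pk * (1 - pk) for uk, pk in zip(u, p))
    # alpha * log(alpha) is taken as 0 at alpha == 0.
    second = alpha * math.log(alpha) * weight if alpha > 0 else 0.0
    return first + second


def direct_depth(u, p, alpha):
    """Same quantity summed term by term from p_kj and v_k = P(C_k)."""
    r = len(p)
    total = 0.0
    for k in range(r):
        for j in range(r):
            pkj = (1 - alpha) * (k == j) + alpha * p[j]
            if pkj > 0:
                total += p[k] * u[j] * pkj * math.log(pkj / p[j])
    return total


u = [0.8, 1.0, 1.3]      # hypothetical weights u(C_k)
p = [0.2, 0.3, 0.5]      # P(C_k)
```

Evaluating `quasi_perfect_depth(u, p, alpha)` on a grid of `alpha` values between 0 and 1 traces the interpolation between the question difficulty and zero depth.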

VII. RELATIONSHIPS BETWEEN QUESTIONS AND ANSWERS 

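Given two complete questions $\mathbf{C}'$ and $\mathbf{C}''$ the pseudoenergy overlap $J(\Omega, (\mathbf{C}'; \mathbf{C}''), P)$ was defined
in [2] as

$$J(\Omega, (\mathbf{C}'; \mathbf{C}''), P) = G(\Omega, \mathbf{C}', P) + G(\Omega, \mathbf{C}'', P) - G(\Omega, \mathbf{C}' \cap \mathbf{C}'', P), \tag{24}$$

which can easily be seen to have the following form

$$J(\Omega, (\mathbf{C}'; \mathbf{C}''), P) = \sum_{l=1}^{r'} \sum_{j=1}^{r''} u(C'_l \cap C''_j)\, P(C'_l \cap C''_j) \log\frac{P(C'_l \cap C''_j)}{P(C'_l)\, P(C''_j)}. \tag{25}$$

It was also shown that the pseudoenergy overlap can be interpreted as the reduction of difficulty
of question $\mathbf{C}''$ due to the knowledge of a perfect answer $V^*(\mathbf{C}')$ to question $\mathbf{C}'$:

$$G(\Omega, \mathbf{C}'', V^*(\mathbf{C}')) = G(\Omega, \mathbf{C}'', P) - J(\Omega, (\mathbf{C}'; \mathbf{C}''), P), \tag{26}$$

where the conditional difficulty $G(\Omega, \mathbf{C}'', V(\mathbf{C}'))$ is defined (for any answer $V(\mathbf{C}')$ to question $\mathbf{C}'$)
as

$$G(\Omega, \mathbf{C}'', V(\mathbf{C}')) = \sum_{k=1}^{m'} \Pr(V(\mathbf{C}') = s'_k)\, G(\Omega, \mathbf{C}'', P'^k). \tag{27}$$

It would be interesting to find out how the relation (26) generalizes for the case of an arbitrary
answer to question $\mathbf{C}'$. Clearly, since a reception of value $s'_k$ of $V(\mathbf{C}')$ updates the measure $P$ to
$P'^k$, the difficulty of $\mathbf{C}''$ given $V(\mathbf{C}') = s'_k$ is equal to

$$G(\Omega, \mathbf{C}'', P'^k) = -\sum_{j=1}^{r''} u(C''_j)\, P'^k(C''_j) \log P'^k(C''_j) = -\sum_{j=1}^{r''} \sum_{l=1}^{r'} u(C'_l \cap C''_j)\, P'^k(C'_l \cap C''_j) \log P'^k(C''_j),$$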

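and therefore the overall (expected) difficulty $G(\Omega, \mathbf{C}'', V(\mathbf{C}'))$ of question $\mathbf{C}''$ given an answer
$V(\mathbf{C}')$ to $\mathbf{C}'$ can be written, denoting $\Pr(V(\mathbf{C}') = s'_k)$ by $v'_k$, as

$$\begin{aligned}
G(\Omega, \mathbf{C}'', V(\mathbf{C}')) &= \sum_{k=1}^{m'} v'_k\, G(\Omega, \mathbf{C}'', P'^k) = -\sum_{k=1}^{m'} v'_k \sum_{j=1}^{r''} \sum_{l=1}^{r'} u(C'_l \cap C''_j)\, P'^k(C'_l \cap C''_j) \log P'^k(C''_j) \\
&= -\sum_{k=1}^{m'} v'_k \sum_{j=1}^{r''} \sum_{l=1}^{r'} u(C'_l \cap C''_j)\, P'^k(C'_l \cap C''_j) \log P'^k(C''_j) \\
&\quad + \sum_{k=1}^{m'} v'_k \sum_{j=1}^{r''} \sum_{l=1}^{r'} u(C'_l \cap C''_j)\, P'^k(C'_l \cap C''_j) \log P(C''_j) - \sum_{k=1}^{m'} v'_k \sum_{j=1}^{r''} \sum_{l=1}^{r'} u(C'_l \cap C''_j)\, P'^k(C'_l \cap C''_j) \log P(C''_j) \\
&= -\sum_{k=1}^{m'} v'_k \sum_{j=1}^{r''} \sum_{l=1}^{r'} u(C'_l \cap C''_j)\, P'^k(C'_l \cap C''_j) \log\frac{P'^k(C''_j)}{P(C''_j)} - \sum_{j=1}^{r''} \sum_{l=1}^{r'} u(C'_l \cap C''_j)\, P(C'_l \cap C''_j) \log P(C''_j) \\
&= G(\Omega, \mathbf{C}'', P) - \sum_{k=1}^{m'} v'_k \sum_{j=1}^{r''} \sum_{l=1}^{r'} u(C'_l \cap C''_j)\, P'^k(C'_l \cap C''_j) \log\frac{P'^k(C''_j)}{P(C''_j)}.
\end{aligned} \tag{28}$$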

We see from (28) that the conditional difficulty of $\mathbf{C}''$ can be represented as a difference of the
standard (unconditional) difficulty and another expression that can be appropriately denoted
$Y(\Omega, \mathbf{C}'', P, V(\mathbf{C}'))$ and called the relative depth of the answer $V(\mathbf{C}')$ with respect to question
$\mathbf{C}''$:

$$G(\Omega, \mathbf{C}'', V(\mathbf{C}')) = G(\Omega, \mathbf{C}'', P) - Y(\Omega, \mathbf{C}'', P, V(\mathbf{C}')), \tag{29}$$

where the relative depth $Y(\Omega, \mathbf{C}'', P, V(\mathbf{C}'))$ is given by

$$Y(\Omega, \mathbf{C}'', P, V(\mathbf{C}')) = \sum_{k=1}^{m'} v'_k \sum_{j=1}^{r''} \sum_{l=1}^{r'} u(C'_l \cap C''_j)\, P'^k(C'_l \cap C''_j) \log\frac{P'^k(C''_j)}{P(C''_j)}. \tag{30}$$

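Using the expression (3) for the updated measures $P'^k$ we find that

$$P'^k(C'_l \cap C''_j) = p'_{kl}\, \frac{P(C'_l \cap C''_j)}{P(C'_l)} \tag{31}$$

and

$$P'^k(C''_j) = \sum_{l=1}^{r'} p'_{kl}\, \frac{P(C'_l \cap C''_j)}{P(C'_l)}, \tag{32}$$

and, substituting (31) and (32) into (30) we obtain for the relative depth:

$$Y(\Omega, \mathbf{C}'', P, V(\mathbf{C}')) = \sum_{k=1}^{m'} v'_k \sum_{j=1}^{r''} \sum_{l=1}^{r'} u(C'_l \cap C''_j)\, p'_{kl}\, \frac{P(C'_l \cap C''_j)}{P(C'_l)} \log \sum_{i=1}^{r'} p'_{ki}\, \frac{P(C'_i \cap C''_j)}{P(C'_i)\, P(C''_j)}. \tag{33}$$

We can summarize the result just obtained as a lemma.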

We can summarize the result just obtained as a lemma. 
Lemma 4 Let $\mathbf{C}'$ and $\mathbf{C}''$ be two arbitrary complete questions on $\Omega$ and let $V(\mathbf{C}')$ be an answer
to $\mathbf{C}'$. Then the conditional difficulty of $\mathbf{C}''$ given the answer $V(\mathbf{C}')$ can be found as

$$G(\Omega, \mathbf{C}'', V(\mathbf{C}')) = G(\Omega, \mathbf{C}'', P) - Y(\Omega, \mathbf{C}'', P, V(\mathbf{C}')),$$

where the relative depth of $V(\mathbf{C}')$ is given by the expression (33).
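
The identity of Lemma 4 lends itself to a direct numerical check. The sketch below is an illustration under stated assumptions: the two questions are described by a hypothetical joint table of cell probabilities $P(C'_l \cap C''_j)$ and cell weights $u(C'_l \cap C''_j)$, the updated cell probabilities are taken as $p'_{kl} P(C'_l \cap C''_j)/P(C'_l)$, and the answer to $\mathbf{C}'$ is quasi-perfect so that the consistency with prior condition holds.

```python
import math


def marginals(joint):
    rows = [sum(row) for row in joint]          # P(C'_l)
    cols = [sum(col) for col in zip(*joint)]    # P(C''_j)
    return rows, cols


def updated_terms(joint, v, p):
    """Yield (v_k, l, j, P'^k(C'_l & C''_j), P'^k(C''_j)) for nonzero cells."""
    rows, cols = marginals(joint)
    for k, vk in enumerate(v):
        for j in range(len(cols)):
            cells = [p[k][l] * joint[l][j] / rows[l] for l in range(len(rows))]
            post_j = sum(cells)
            for l, cell in enumerate(cells):
                if cell > 0:
                    yield vk, l, j, cell, post_j


def relative_depth(ucell, joint, v, p):
    """Relative depth of the answer to C' with respect to C''."""
    _, cols = marginals(joint)
    return sum(vk * ucell[l][j] * cell * math.log(post_j / cols[j])
               for vk, l, j, cell, post_j in updated_terms(joint, v, p))


def conditional_difficulty(ucell, joint, v, p):
    """Expected difficulty of C'' after the answer, summed directly."""
    return -sum(vk * ucell[l][j] * cell * math.log(post_j)
                for vk, l, j, cell, post_j in updated_terms(joint, v, p))


def difficulty_second(ucell, joint):
    """Unconditional difficulty of C''."""
    _, cols = marginals(joint)
    return -sum(ucell[l][j] * joint[l][j] * math.log(cols[j])
                for l in range(len(joint)) for j in range(len(cols)))


# Hypothetical joint table P(C'_l & C''_j) and cell weights u(C'_l & C''_j).
joint = [[0.10, 0.20, 0.05], [0.15, 0.05, 0.10], [0.05, 0.10, 0.20]]
ucell = [[0.8, 1.0, 1.2], [1.1, 0.9, 1.0], [1.3, 1.0, 0.7]]
rows, _ = marginals(joint)
alpha = 0.4
p = [[(1 - alpha) * (k == l) + alpha * rows[l] for l in range(3)] for k in range(3)]
v = rows  # quasi-perfect answers have v_k = P(C'_k)

lhs = conditional_difficulty(ucell, joint, v, p)
rhs = difficulty_second(ucell, joint) - relative_depth(ucell, joint, v, p)
```

Here `lhs` and `rhs` are the two sides of the identity stated in the lemma; they agree up to rounding for any answer that satisfies the consistency condition.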

Suppose now that $V^*(\mathbf{C}')$ is a perfect answer to $\mathbf{C}'$, which implies that $m' = r'$ and $p'_{kl} = \delta_{kl}$.
Substituting this into (33) and performing the sum over $k$ while making use of the answer
consistency condition we obtain

$$Y(\Omega, \mathbf{C}'', P, V^*(\mathbf{C}')) = \sum_{l=1}^{r'} \sum_{j=1}^{r''} u(C'_l \cap C''_j)\, P(C'_l \cap C''_j) \log\frac{P(C'_l \cap C''_j)}{P(C'_l)\, P(C''_j)}, \tag{34}$$

which coincides with the expression (25) for the pseudoenergy overlap between questions $\mathbf{C}'$ and
$\mathbf{C}''$. We thus recover the result (26) obtained in [2].

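Let now $V_\alpha(\mathbf{C}')$ be a quasi-perfect answer to question $\mathbf{C}'$ characterized by error probability $\alpha$.
Substituting expressions (20) and (23) into (33) we obtain, after some straightforward algebra,

$$\begin{aligned}
Y(\Omega, \mathbf{C}'', P, V_\alpha(\mathbf{C}')) &= (1 - \alpha) \sum_{l=1}^{r'} \sum_{j=1}^{r''} u(C'_l \cap C''_j)\, P(C'_l \cap C''_j) \log\left[ (1 - \alpha) \frac{P(C'_l \cap C''_j)}{P(C'_l)\, P(C''_j)} + \alpha \right] \\
&\quad + \alpha \sum_{l=1}^{r'} \sum_{j=1}^{r''} u(C'_l \cap C''_j)\, P(C'_l \cap C''_j) \sum_{k=1}^{r'} P(C'_k) \log\left[ (1 - \alpha) \frac{P(C'_k \cap C''_j)}{P(C'_k)\, P(C''_j)} + \alpha \right].
\end{aligned} \tag{35}$$

It is easy to see that for $\alpha = 0$ (35) reduces to (34), which is the overlap between questions $\mathbf{C}'$ and
$\mathbf{C}''$, and for $\mathbf{C}''$ coinciding with $\mathbf{C}'$ the relative depth (35) becomes the depth $Y(\Omega, \mathbf{C}', P, V_\alpha(\mathbf{C}'))$
(given by expression (22)) of a quasi-perfect answer to $\mathbf{C}'$ characterized by the same value of error
probability $\alpha$. To see that, it is sufficient to set $C''_j = C'_j$ (and hence $P(C'_l \cap C''_j) = \delta_{lj} P(C'_l)$) in
(35) and make use of the (obvious) identity $\sum_{k \ne j} P(C'_k) = 1 - P(C'_j)$.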



VIII. INFORMATION SOURCE MODELS 



The question difficulty functional describes the source knowledge structure by specifying the 
amount of pseudoenergy associated with any question C. The answer depth, on the other hand, 
quantifies the pseudoenergy associated with any answer of the source to question C, with more 
accurate answers carrying larger amounts of pseudoenergy. The next logical question is how ac- 
curately can the given information source answer the specific question C. This question can be 
restated by asking what value of answer depth the source is able to provide in response to C. It 
would be very natural to assume that this depth should be a function of the difficulty of question
C. This assumption essentially implies that the corresponding question difficulty faithfully charac- 
terizes the source knowledge structure. Finding this assumption to be wrong can be looked upon 
as an indication of the agent's failure to identify some essential features of the source knowledge 
structure (like, for example, its anisotropy). We formulate this assumption as a hypothesis. 

Hypothesis S1. For the given information source and any question $\mathbf{C}$, the corresponding
answer depth is a function of the question difficulty:

$$Y(\Omega, \mathbf{C}, P, V(\mathbf{C})) = h(G(\Omega, \mathbf{C}, P)),$$

where $h\colon \mathbb{R}_+ \to \mathbb{R}_+$ is a function of a single argument.

Note that Hypothesis S1 can be thought of as that of existence of some knowledge structure
(quite possibly a fairly complicated one) for the given information source. Put slightly differently, 
it states that the set of all possible questions can be represented as a partially ordered set with 
respect to the source's ability to answer them accurately. If this is indeed true, symmetry and 
consistency considerations can be invoked to find the specific form of the question difficulty and 
answer depth functionals. 

The specific shape of function $h(\cdot)$ can be determined by experimentation: one would generally
have to assume some reasonable overall shape and then use sample questions and the source's
answers to estimate parameters. (The procedure is similar to that applied, for instance, to fitting
regression models.) If a particular (simple) model for $h(\cdot)$ is found inadequate, more elaborate
models can be employed, until a good stable fit can be established. We briefly discuss some
possible models next.

A. Possible source models 

The overall shape of function $h(\cdot)$ is, in general, arbitrary, but it would be reasonable to make
some initial assumptions on the grounds of general experience. In particular, it would be sensible to
assume that the function $h(\cdot)$ possesses one or more of the following properties: (i) non-decreasing;
(ii) continuous; (iii) bounded from above. Property (i) simply implies that the source can produce
at least as much depth when answering a more difficult question. Property (ii) means that if two
questions are close in difficulty the source will produce answers of close depth. And property (iii)
implies existence of a well-defined source capacity, the highest answer depth the source is capable
of. Let us now take a look at some simple models satisfying these properties.
Simple capacity model

In this model, the information source is characterized by a single parameter, the source
(pseudoenergy) capacity, which we denote by $Y_s$. Under this model, the source can provide perfect
answers to questions whose difficulty does not exceed $Y_s$ and, for questions with difficulty exceeding
$Y_s$, the error probabilities increase in such a way that the depth of the corresponding answer stays
equal to $Y_s$. Put slightly differently, the source provides answers whose depth is constant unless the
question is too easy for the source, in which case the depth of the answer is limited by the difficulty
of the question. Formally speaking, the function $h(x)$ for this model takes the following form.

$$h(x) = \begin{cases} x & \text{if } x \le Y_s \\ Y_s & \text{if } x > Y_s. \end{cases} \tag{36}$$



Modified capacity models

The main drawback of the simple capacity model described above is that the information source
is postulated to provide perfect answers to questions whose difficulty is below the source's capacity.
On the other hand, in many situations, it is reasonable to expect that a source will make some
errors answering even simple questions. The goal of the modified capacity models is to allow for finite
error probabilities for answers to questions with difficulties below the source capacity. These models
depend on more than one parameter: besides the capacity $Y_s$, there is also a parameter
describing the way the function $h(\cdot)$ approaches its maximum value $Y_s$. The simplest such model is the
linear modified capacity model described by

$$h(x) = \begin{cases} bx & \text{if } x \le Y_s / b \\ Y_s & \text{if } x > Y_s / b, \end{cases} \tag{37}$$

where $b < 1$ is the second parameter. Under this model, the source makes errors even on questions
with difficulties below the capacity, with error probabilities gradually increasing with question
difficulties. Once the question difficulty exceeds $Y_s / b$, the corresponding
answer depth stays equal to the capacity $Y_s$.

The linear modified capacity model can be naturally generalized to a polynomial modified
capacity model in which the function $h(\cdot)$ approaches its maximum value according to a polynomial
law. To describe it, let $p_q(x) = a_0 + a_1 x + \dots + a_q x^q$ be an order $q$ polynomial and let $x^*$ be the
smallest positive root of the equation $p_q(x) - Y_s = 0$. Then the polynomial modified capacity model
has the form

$$h(x) = \begin{cases} p_q(x) & \text{if } x \le x^* \\ Y_s & \text{if } x > x^*. \end{cases} \tag{38}$$

Demanding that $h(0) = 0$ and $h(x) \le x$ for all $x > 0$ leads to $a_0 = 0$ and $0 < a_1 \le 1$. For $q = 2$,
the polynomial modified capacity model (38) reduces to the quadratic modified capacity model that
has the form

$$h(x) = \begin{cases} bx + cx^2 & \text{if } x \le x_2^* \\ Y_s & \text{if } x > x_2^*, \end{cases} \tag{39}$$

where $0 < b \le 1$ and (assuming $c < 0$) $|c| \le \frac{b^2}{4 Y_s}$, $x_2^* = \frac{b}{2|c|} - \frac{\sqrt{b^2 - 4|c| Y_s}}{2|c|}$.

Another simple model that belongs to the class of modified capacity models is the exponential
modified capacity model

$$h(x) = Y_s \left(1 - e^{-\theta x}\right) \tag{40}$$

that depends on two parameters: the capacity $Y_s$ and $0 < \theta \le \frac{1}{Y_s}$, which controls the speed with which
the function $h(x)$ approaches its upper bound $Y_s$. One of the advantages of the exponential model
(40) is that it is described by a single analytical function, which makes it possible to avoid binary variables in
the corresponding estimation problem discussed in the next section.
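
The four source models can be written down as ordinary functions. This is a minimal sketch under stated assumptions: the linear model is read as $h(x) = \min(bx, Y_s)$, the quadratic model switches to $Y_s$ at the smaller root of $bx + cx^2 = Y_s$ with $c < 0$, and the exponential model is read as $h(x) = Y_s(1 - e^{-\theta x})$; the function names and parameter values are hypothetical.

```python
import math


def simple_capacity(x, ys):
    """Simple capacity model: depth equals difficulty up to the capacity ys."""
    return min(x, ys)


def linear_capacity(x, ys, b):
    """Linear modified capacity model: b*x until it reaches ys (0 < b < 1)."""
    return min(b * x, ys)


def quadratic_threshold(ys, b, c):
    """Smallest positive root of b*x + c*x**2 = ys, assuming c < 0."""
    return (b - math.sqrt(b * b - 4 * abs(c) * ys)) / (2 * abs(c))


def quadratic_capacity(x, ys, b, c):
    """Quadratic modified capacity model with c < 0 and |c| <= b**2 / (4*ys)."""
    return b * x + c * x * x if x <= quadratic_threshold(ys, b, c) else ys


def exponential_capacity(x, ys, theta):
    """Exponential modified capacity model; approaches ys as x grows."""
    return ys * (1 - math.exp(-theta * x))


# Tabulate the four models on a small grid of difficulties (hypothetical values).
ys, b, c, theta = 1.0, 0.8, -0.1, 0.9
table = [
    (x, [simple_capacity(x, ys), linear_capacity(x, ys, b),
         quadratic_capacity(x, ys, b, c), exponential_capacity(x, ys, theta)])
    for x in [0.0, 0.5, 1.0, 1.5, 2.0, 4.0]
]
```

With these parameter choices every model stays below both the difficulty $x$ and the capacity $Y_s$, in line with the requirement that the depth of an answer cannot exceed the difficulty of the question.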



IX. ESTIMATION OF PSEUDOTEMPERATURE AND SOURCE MODEL PARAMETERS

In this section, just like in the rest of this article and [2], we assume that the linear isotropic model
of the source knowledge structure holds. First, let us note that both question difficulty and answer
depth functionals are linear in $u(\omega)$ and therefore multiplying $u(\omega)$ by any constant would result
in both difficulty and depth being multiplied by the same constant without changing any of the
coefficients $p_{kj}$, $k = 1, \dots, m$, $j = 1, \dots, r$ and, therefore, answer error probabilities. This means
that the function $u(\omega)$ is really defined up to a single multiplicative constant, the choice of which is
equivalent to a choice of units in which $u(\omega)$ (and the difficulty/depth functionals) are measured.
We use two different conventions that turn out to be convenient.

• The normalized $u(\omega)$ convention in which $\int_\Omega u(\omega)\, dP(\omega) = 1$ for every information source.
This convention is convenient because if $u(\omega) \equiv 1$, the difficulty of question $\mathbf{C}$ reduces to
Shannon entropy of the distribution $P(\mathbf{C}) = (P(C_1), \dots, P(C_r))$. In a sense, this allows for
measuring pseudoenergy in units commensurate with standard information bits.

• The unit source capacity convention in which the units of $u(\omega)$ are chosen in such a way that,
for each information source, the source capacity (assuming it exists) is unity: $Y_s = 1$. This
convention is useful for comparing different information sources to each other. Indeed, in
this case, functions $u(\omega)$ for any two sources can be directly compared to each other, showing
clearly the relative degree of "expertise" of each source in various regions of $\Omega$ and also giving
a sense of "absolute" quality of each source.

If the function $u(\omega)$ is known, Theorem 1 gives (for the given measure $P$) the difficulty of any
question $\mathbf{C}$. Then, for any answer $V(\mathbf{C})$ to $\mathbf{C}$, the knowledge of updated measures $P^k$ allows one
to find the depth of $V(\mathbf{C})$. On the other hand, a given source model $Y = h(G)$ lets one predict
the depth of the source's answer to any question before measures $P^k$ can be estimated. Thus in
order to be able to predict the depth of the source's answers to various questions one needs to know the
function $u(\omega)$ and the source model described by the function $h(\cdot)$. Since these functions cannot
be directly measured or observed, the only way to know them in any application is to estimate
them from the source's performance on a certain set of sample questions.

Let $\mathbf{D} = \{D_1, \dots, D_{N_d}\}$ be a partition of $\Omega$ to be used for discretizing the weight function
$u(\omega)$: we assume that $u(\omega)$ takes a constant value equal to $u_i$ on subset $D_i$. Let $w_i = P(D_i)$
and let $H_i \subset \{1, \dots, N_d\}$ be the index set of subsets in $\mathbf{D}$ that are immediate neighbors of (i.e. have a
common boundary with) subset $D_i$. We assume that the partition $\mathbf{D}$ is sufficiently fine so that
any partition $\mathbf{C}$ used for estimating $u(\omega)$ can be considered a coarsening of $\mathbf{D}$.

Further, let $\mathbf{C}_1, \dots, \mathbf{C}_K$ be a set of questions that the source has answered and whose answers have
been compared with actual outcomes in $\Omega$. Let us denote by $G_1, \dots, G_K$ the difficulties of these
questions and let $Y_1, \dots, Y_K$ be the corresponding answer depth values that were computed using
the estimated error probabilities.

Let us introduce the notation $z_i = |Y_i - h(G_i)|$, $i = 1, \dots, K$, where the function $h(\cdot)$ is given by
the suitable information source model. The quantities $z_i$ measure the absolute values of deviations
of the empirical data from the chosen source model, with vanishing values of all variables $z_i$
corresponding to a perfect fit. In addition to minimizing the sum of the deviations (i.e. maximizing
the fit), it makes sense to demand that the quantities $u_j$, $j = 1, \dots, N_d$, describe a reasonably
smooth function $u(\omega)$. This can be achieved, for instance, by putting an upper bound on the
gradient of $u(\omega)$ or, equivalently, by putting a corresponding term in the objective function. To
make it more precise, let $N(\mathbf{D})$ be the set of neighbors in the partition $\mathbf{D}$ and let $U$ be the desired
upper bound on the difference of two values of $u$ on neighboring sets of partition $\mathbf{D}$. Then, if
the capacity model $h(\cdot)$ is postulated, the following formulation of the estimation problem for the
function $u(\omega)$ and the parameters of model $h(\cdot)$ is obtained.



$$\begin{aligned}
\text{minimize} \quad & \sum_{i=1}^{K} z_i + \lambda U \\
\text{subject to} \quad & Y_i - h(G_i) \le z_i, \quad i = 1, \dots, K \\
& h(G_i) - Y_i \le z_i, \quad i = 1, \dots, K \\
& u_j - u_k \le U, \quad (j, k) \in N(\mathbf{D}) \\
& u_k - u_j \le U, \quad (j, k) \in N(\mathbf{D})
\end{aligned} \tag{41}$$

The decision variables in (41) are $z_i$, $i = 1, \dots, K$, $u_j$, $j = 1, \dots, N_d$, and the parameters of function $h(\cdot)$; $\lambda$ is
a parameter that controls the trade-off between the objective of maximizing the fit and that of
maximizing smoothness of $u(\omega)$ (understood as minimizing the maximum gradient of $u(\omega)$). The
difficulties $G_i$, $i = 1, \dots, K$, are expressed via the decision variables as follows.

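$$G_i = -\sum_{j} \Bigg[ \sum_{\{l:\, D_l \subset C_j\}} u_l w_l \Bigg] \log P(C_j), \tag{42}$$

where the outer sum runs over the subsets $C_j$ of question $\mathbf{C}_i$.
For the values of the depth functional for the corresponding answers, let us assume, for simplicity,
that the answers are quasi-perfect, implying that their errors can be characterized with a single
probability $\alpha_i$, $i = 1, \dots, K$. Then the depth $Y_i$ can be written as

$$Y_i = \sum_{j} (1 - \alpha_i + \alpha_i P(C_j)) \log\frac{1 - \alpha_i + \alpha_i P(C_j)}{P(C_j)} \sum_{\{l:\, D_l \subset C_j\}} u_l w_l + \alpha_i \log\alpha_i \sum_{j} (1 - P(C_j)) \sum_{\{l:\, D_l \subset C_j\}} u_l w_l. \tag{43}$$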



Note that, in general, (41) is a potentially complex nonlinear optimization problem where
nonlinearity is introduced by the function $h(\cdot)$. For the case of the simple capacity model, the
problem (41) can be written as

$$\begin{aligned}
\text{minimize} \quad & \sum_{i=1}^{K} z_i + \lambda U \\
\text{subject to} \quad & Y_i - Y_s \le z_i + M y_i, \quad i = 1, \dots, K \\
& Y_s - Y_i \le z_i + M y_i, \quad i = 1, \dots, K \\
& G_i - Y_i \le z_i + M (1 - y_i), \quad i = 1, \dots, K \\
& u_j - u_k \le U, \quad (j, k) \in N(\mathbf{D}) \\
& u_k - u_j \le U, \quad (j, k) \in N(\mathbf{D}) \\
& y_i \in \{0, 1\}, \quad i = 1, \dots, K
\end{aligned} \tag{44}$$

In this formulation, $M$ is a large number and $y_i$, $i = 1, \dots, K$, are auxiliary binary variables. The main
decision variables in the formulation (44) are the values $u_j$, $j = 1, \dots, N_d$, and the capacity value
$Y_s$. Since both (42) and (43) are linear in the variables $u_j$, the optimization problem (44) is mixed-integer
linear with $K$ binary variables and therefore can be solved efficiently at least for a moderate number
$K$ of sample questions used for estimating the model parameter $Y_s$ and the (discretized) function $u(\omega)$.
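
To give a feel for the fitting objective, here is a deliberately simplified sketch. It is not the mixed-integer program above: it assumes the weights $u_j$ are already fixed (so the difficulties $G_i$ are known numbers), drops the smoothness term, and replaces the optimization by a plain search over candidate values of $Y_s$ taken from the observed data. All names and data are hypothetical.

```python
def simple_capacity(x, ys):
    """Simple capacity model h(x) = min(x, ys)."""
    return min(x, ys)


def total_deviation(ys, difficulties, depths):
    """Sum of absolute deviations z_i = |Y_i - h(G_i)| for a trial capacity."""
    return sum(
        abs(y - simple_capacity(g, ys)) for g, y in zip(difficulties, depths)
    )


def fit_capacity(difficulties, depths):
    """Pick the candidate capacity with the smallest total deviation.

    Only the observed difficulty and depth values are tried as candidates;
    this is a heuristic shortcut, not a general-purpose solver.
    """
    candidates = sorted(set(difficulties) | set(depths))
    return min(
        candidates, key=lambda ys: total_deviation(ys, difficulties, depths)
    )


difficulties = [0.5, 1.0, 2.0, 3.0]   # hypothetical G_i, assumed precomputed
depths = [0.5, 1.0, 1.5, 1.5]         # hypothetical observed depths Y_i
best = fit_capacity(difficulties, depths)
```

When the weights $u_j$ must be estimated together with $Y_s$, this shortcut no longer applies and a solver for the full formulation is needed.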

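The formulation (44) can be modified easily from the simple to the modified capacity model.
The resulting formulation is as follows.

$$\begin{aligned}
\text{minimize} \quad & \sum_{i=1}^{K} z_i + \lambda U \\
\text{subject to} \quad & Y_i - Y_s \le z_i + M y_i, \quad i = 1, \dots, K \\
& Y_s - Y_i \le z_i + M y_i, \quad i = 1, \dots, K \\
& b G_i - Y_i \le z_i + M (1 - y_i), \quad i = 1, \dots, K \\
& u_j - u_k \le U, \quad (j, k) \in N(\mathbf{D}) \\
& u_k - u_j \le U, \quad (j, k) \in N(\mathbf{D}) \\
& y_i \in \{0, 1\}, \quad i = 1, \dots, K
\end{aligned} \tag{45}$$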

The additional decision variable in (45) is $b < 1$. The values $G_i$ and $Y_i$, $i = 1, \dots, K$, are given
by expressions (42) and (43), respectively. The formulation (45), just like (44), is a mixed-integer
linear optimization problem with $K$ binary variables and thus can be solved efficiently at least for
moderate values of the number $K$ of sample questions.

The formulation for the quadratic modified capacity model (39) can be easily obtained from
(45) by replacing the constraints $b G_i - Y_i \le z_i + M(1 - y_i)$, $i = 1, \dots, K$, with
$b G_i + c G_i^2 - Y_i \le z_i + M(1 - y_i)$, $i = 1, \dots, K$. Recalling that $G_i$ is a linear function of the decision variables $u_l$,
we see that the resulting problem is that of quadratic optimization with $K$ binary variables that
enter the formulation in a linear fashion. Even though such problems cannot in general be solved as
efficiently as mixed-integer linear optimization problems of equal size, they can still be solved to optimality
for problems of moderate size.

The exponential modified capacity model, as mentioned earlier, has the advantage that the
corresponding formulation of the estimation problem obviates the need for binary variables, even
though it becomes severely nonlinear:

$$\begin{aligned}
\text{minimize} \quad & \sum_{i=1}^{K} z_i + \lambda U \\
\text{subject to} \quad & Y_i - Y_s \left(1 - e^{-\theta G_i}\right) \le z_i, \quad i = 1, \dots, K \\
& Y_s \left(1 - e^{-\theta G_i}\right) - Y_i \le z_i, \quad i = 1, \dots, K \\
& u_j - u_k \le U, \quad (j, k) \in N(\mathbf{D}) \\
& u_k - u_j \le U, \quad (j, k) \in N(\mathbf{D})
\end{aligned} \tag{46}$$

Besides the quantities $z_i$, $i = 1, \dots, K$, $u_l$, $l = 1, \dots, N_d$, and the source capacity $Y_s$, another
decision variable is the parameter $\theta$ of the exponential model (40).

It is worth noting that in estimation of the pseudotemperature function and model parameters, 
the error probabilities are themselves estimated values. That introduces obvious imprecision in 
estimation of pseudotemperature and source model parameters. In fact, one can think of the 
procedure described in this section as similar to point estimation of parameters in classical statistics. 
For more information about the pseudotemperature function, confidence intervals would be needed. 
The width of such confidence intervals would obviously depend on the precision with which error 
probabilities are known and therefore on the sample size used in error probability estimation. 
Practically, such confidence intervals may turn out to be sufficiently wide to effectively invalidate 
precise estimation of the shape of pseudotemperature function. The practical approach instead 
could be that of the hypothesis testing type: a null (default) hypothesis about the shape of the 
pseudotemperature function would be stated (e.g. that the pseudotemperature is constant or linear)
and then tested using standard statistical methods. 

Just like in probability estimation, expert opinion can be used for estimating pseudotemperature 
function. Since pseudotemperature admits a simple intuitive interpretation (as local "degree of 
difficulty") experts should find it easy enough to give useful estimates of pseudotemperature. If, in 
addition, some data about observed source performance is available, it can be used in conjunction 
with expert estimates by, for instance, using expert estimate as a null hypothesis and using observed
data for the purpose of testing it. 

X. EXAMPLES 


Let us revisit the example with a finite parameter space from [2]. The parameter space $\Omega$ consists
of 8 elements, corresponding to green, yellow and red apples (denoted $GA$, $YA$ and $RA$, respectively),
green, yellow and red pears (denoted $GPr$, $YPr$ and $RPr$), and yellow and red peaches
(denoted $YPc$ and $RPc$). The elements are equiprobable so that $P(\omega) = \frac{1}{8}$ for all $\omega \in \Omega$. The function
$u(\cdot)$ describes the relative difficulty of respective ideal questions. We set $u(GA) = u(GPr) = 1$,
$u(YPr) = u(RPr) = 1.5$, and $u(YA) = u(RA) = u(YPc) = u(RPc) = 2$. Normalizing the values
of $u(\cdot)$ so that $\int_\Omega u(\omega)\, dP(\omega) = 1$ we obtain $u(GA) = u(GPr) = \frac{8}{13}$, $u(YPr) = u(RPr) = \frac{12}{13}$ and
$u(YA) = u(RA) = u(YPc) = u(RPc) = \frac{16}{13}$.

Consider, as in [2], the question "Is the fruit green or not?". Let $C_g = \{GA, GPr\} \subset \Omega$ be the
subset consisting of all green fruit (apples and pears) and let $\bar{C}_g = \Omega \setminus C_g$ be the subset containing
fruit of all other colors (red and yellow). The partition is $\mathbf{C}_g = \{C_g, \bar{C}_g\}$. The values of $u(\cdot)$ for
the sets in this partition are $u(C_g) = \frac{8}{13}$ and $u(\bar{C}_g) = \frac{1}{3} \cdot \frac{12}{13} + \frac{2}{3} \cdot \frac{16}{13} = \frac{44}{39}$. The measures are
$P(C_g) = \frac{1}{4}$ and $P(\bar{C}_g) = \frac{3}{4}$. The second similar question is "Is the fruit a peach or not?". The
corresponding partition is $\mathbf{C}_{Pc} = \{C_{Pc}, \bar{C}_{Pc}\}$ where $C_{Pc} = \{YPc, RPc\}$ and $\bar{C}_{Pc} = \Omega \setminus C_{Pc}$. The
values of function $u(\cdot)$ on these subsets are $u(C_{Pc}) = \frac{16}{13}$ and $u(\bar{C}_{Pc}) = \frac{1}{3} \cdot \frac{8}{13} + \frac{1}{3} \cdot \frac{12}{13} + \frac{1}{3} \cdot \frac{16}{13} = \frac{12}{13}$.
The measures are $P(C_{Pc}) = \frac{1}{4}$ and $P(\bar{C}_{Pc}) = \frac{3}{4}$. Let $V_\alpha(\mathbf{C}_g)$ and $V_\alpha(\mathbf{C}_{Pc})$ be the corresponding
quasi-perfect answers. The depth functionals of these answers can be computed using (22) as

Y(n, C g ,P, V a (C g )) = 1 (l - log(4 - 3a) + H (i - l a \ log ^ + ^a log a, 

and 

4/ 3 \ Q/ 1\ 4 — n- 21 

Y (O, C Pc , P, V a (C Pc )) = - I 1 - -aj log(4 - 3a) + - M - -a) log — + -a log a. 
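The closed forms above can be cross-checked numerically. The sketch below rests on an assumption about what expression (22) amounts to for a quasi-perfect answer, since that expression is not reproduced in this section: after the answer "$C_k$" the posterior mass of $C_k$ is taken to be $1 - \alpha(1 - P(C_k))$ and that of every other $C_i$ to be $\alpha P(C_i)$, and the depth is the $P(C_k)$-weighted sum of posterior pseudoenergy times the log of the posterior-to-prior ratio. The function names and the use of natural logarithms are illustrative.

```python
import math


def quasi_perfect_depth(masses, energies, alpha):
    """Depth of a quasi-perfect answer with error probability alpha.

    masses[i]   = P(C_i)
    energies[i] = integral of u over C_i, i.e. u(C_i) * P(C_i)
    Assumed posterior after the answer "C_k":
        P_k(C_k) = 1 - alpha * (1 - P(C_k)),  P_k(C_i) = alpha * P(C_i) for i != k.
    """
    total = 0.0
    for k, p_k in enumerate(masses):
        for i, (p_i, e_i) in enumerate(zip(masses, energies)):
            post = 1 - alpha * (1 - p_i) if i == k else alpha * p_i
            if post > 0:  # the term 0 * log 0 is taken as 0
                total += p_k * (e_i / p_i) * post * math.log(post / p_i)
    return total


# "Is the fruit green or not?": P = (1/4, 3/4), u = (8/13, 44/39).
masses = [1 / 4, 3 / 4]
energies = [2 / 13, 11 / 13]


def closed_form_green(alpha):
    """Closed form quoted in the text for the green question (0 < alpha < 1)."""
    return (2 / 13 * (1 - 0.75 * alpha) * math.log(4 - 3 * alpha)
            + 11 / 13 * (1 - 0.25 * alpha) * math.log((4 - alpha) / 3)
            + 17 / 52 * alpha * math.log(alpha))


for a in (0.1, 0.5, 0.9):
    print(a, quasi_perfect_depth(masses, energies, a), closed_form_green(a))
```

Under this assumption the depth equals the question difficulty at $\alpha = 0$ and vanishes at $\alpha = 1$, in line with the limiting behavior described in the article.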

Consider the question "What color is the given fruit?" on one hand and "What type is the given
fruit?" on the other. The former question can be represented as the partition $\mathbf{C}_c = \{C_g, C_y, C_r\}$
where $C_g = \{GA, GPr\}$, $C_y = \{YA, YPr, YPc\}$ and $C_r = \{RA, RPr, RPc\}$; the latter question
can be identified with the partition $\mathbf{C}_t = \{C_A, C_{Pr}, C_{Pc}\}$ where $C_A = \{GA, YA, RA\}$,
$C_{Pr} = \{GPr, YPr, RPr\}$ and $C_{Pc} = \{YPc, RPc\}$. The values of $u(\cdot)$ on these subsets are
$u(C_g) = \frac{8}{13}$, $u(C_y) = \frac{1}{3}\cdot\frac{12}{13} + \frac{2}{3}\cdot\frac{16}{13} = \frac{44}{39}$, $u(C_r) = u(C_y) = \frac{44}{39}$,
$u(C_A) = \frac{1}{3}\cdot\frac{8}{13} + \frac{2}{3}\cdot\frac{16}{13} = \frac{40}{39}$, $u(C_{Pr}) = \frac{1}{3}\cdot\frac{8}{13} + \frac{2}{3}\cdot\frac{12}{13} = \frac{32}{39}$,
$u(C_{Pc}) = \frac{16}{13}$. The measures are $P(C_g) = \frac{1}{4}$, $P(C_y) = \frac{3}{8}$, $P(C_r) = \frac{3}{8}$;
$P(C_A) = P(C_{Pr}) = \frac{3}{8}$, $P(C_{Pc}) = \frac{1}{4}$. Let $V_\alpha(\mathbf{C}_c)$ and $V_\alpha(\mathbf{C}_t)$ be quasi-perfect answers to
questions $\mathbf{C}_c$ and $\mathbf{C}_t$. The depth of these answers can be found using the expression (22). The
results are (see Fig. 4)

$$Y(\Omega, \mathbf{C}_c, P, V_\alpha(\mathbf{C}_c)) = \frac{2}{13}\left(1 - \frac{3}{4}\alpha\right)\log(4 - 3\alpha) + \frac{11}{13}\left(1 - \frac{5}{8}\alpha\right)\log\frac{8 - 5\alpha}{3} + \frac{67}{104}\,\alpha\log\alpha,$$

and

$$Y(\Omega, \mathbf{C}_t, P, V_\alpha(\mathbf{C}_t)) = \frac{4}{13}\left(1 - \frac{3}{4}\alpha\right)\log(4 - 3\alpha) + \frac{9}{13}\left(1 - \frac{5}{8}\alpha\right)\log\frac{8 - 5\alpha}{3} + \frac{69}{104}\,\alpha\log\alpha.$$
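The rational coefficients in the two expressions above can be reproduced from the element-level data. The sketch below assumes the same quasi-perfect posterior as used for the two-subset questions (posterior mass $1 - \alpha(1 - P(C_k))$ on the named subset, $\alpha P(C_i)$ elsewhere), under which the coefficient of $\alpha\log\alpha$ is $1 - \sum_k P(C_k)\int_{C_k} u\,dP$; the helper name `coefficients` is illustrative.

```python
from fractions import Fraction as F

# Normalized pseudotemperature values (numerators over 13) and equal masses 1/8.
u = {"GA": 8, "GPr": 8, "YPr": 12, "RPr": 12,
     "YA": 16, "RA": 16, "YPc": 16, "RPc": 16}
u = {name: F(value, 13) for name, value in u.items()}
P_ELEMENT = F(1, 8)


def coefficients(partition):
    """Return subset masses, subset pseudoenergies and the alpha*log(alpha) coefficient."""
    masses = [P_ELEMENT * len(subset) for subset in partition]
    energies = [sum(u[w] for w in subset) * P_ELEMENT for subset in partition]
    cross = 1 - sum(m * e for m, e in zip(masses, energies))
    return masses, energies, cross


color = [["GA", "GPr"], ["YA", "YPr", "YPc"], ["RA", "RPr", "RPc"]]
fruit_type = [["GA", "YA", "RA"], ["GPr", "YPr", "RPr"], ["YPc", "RPc"]]

color_masses, color_energies, color_cross = coefficients(color)
type_masses, type_energies, type_cross = coefficients(fruit_type)
print(color_energies, color_cross)
print(type_energies, type_cross)
```

The two subsets of measure 3/8 share the same logarithm, so their pseudoenergies add up to the single coefficients 11/13 (color) and 9/13 (type).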




FIG. 4. Answer depth as a function of $\alpha$ for quasi-perfect answers to questions on the finite parameter
space (left) and infinite parameter space (right).



Let us consider the second example from [2]. The parameter space is $\Omega = [0, 1]^2 \subset \mathbb{R}^2$. Let the
pseudotemperature function be $u(\omega) = \frac{3}{2}(\omega_1^2 + \omega_2^2)$ (so that the hard questions are located towards
the upper-right corner of $\Omega$). Consider the following three subsets of $\Omega$:
$C_1 = \{\omega : \omega_1 \in [\frac{1}{2}, 1], \omega_2 \in [\frac{1}{2}, 1]\}$, $C_2 = \{\omega : \omega_1 \in [0, \frac{1}{2}], \omega_2 \in [0, \frac{1}{2}]\}$,
$C_3 = \{\omega : \omega_1 \in [0, \frac{1}{2}], \omega_2 \in [\frac{1}{2}, 1]\}$, and let $\mathbf{C}_j = \{C_j, \Omega \setminus C_j\}$
for $j = 1, 2, 3$ be three complete questions on $\Omega$. Let $V_\alpha(\mathbf{C}_j)$ be a quasi-perfect answer to question
$\mathbf{C}_j$, $j = 1, 2, 3$, characterized by error probability $\alpha$. We can use the expression (22) to obtain the
depth of these answers (see Fig. 4 for an illustration):

$$Y(\Omega, \mathbf{C}_1, P, V_\alpha(\mathbf{C}_1)) = \frac{7}{16}\left(1 - \frac{3}{4}\alpha\right)\log(4 - 3\alpha) + \frac{9}{16}\left(1 - \frac{1}{4}\alpha\right)\log\frac{4 - \alpha}{3} + \frac{15}{32}\,\alpha\log\alpha,$$

$$Y(\Omega, \mathbf{C}_2, P, V_\alpha(\mathbf{C}_2)) = \frac{1}{16}\left(1 - \frac{3}{4}\alpha\right)\log(4 - 3\alpha) + \frac{15}{16}\left(1 - \frac{1}{4}\alpha\right)\log\frac{4 - \alpha}{3} + \frac{9}{32}\,\alpha\log\alpha,$$

and

$$Y(\Omega, \mathbf{C}_3, P, V_\alpha(\mathbf{C}_3)) = \frac{1}{4}\left(1 - \frac{3}{4}\alpha\right)\log(4 - 3\alpha) + \frac{3}{4}\left(1 - \frac{1}{4}\alpha\right)\log\frac{4 - \alpha}{3} + \frac{3}{8}\,\alpha\log\alpha.$$

FIG. 5. Relative depth of quasi-perfect answers as a function of $\alpha$.
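The leading coefficients in the three expressions are the pseudoenergies $\int_{C_j} u\,dP$ of the three quadrants, which can be computed exactly because $u$ is a polynomial. The sketch below does this with rational arithmetic; the helper name `energy` is illustrative, and the formula used for the $\alpha\log\alpha$ coefficient assumes the same quasi-perfect posterior as in the finite example.

```python
from fractions import Fraction as F


def energy(x0, x1, y0, y1):
    """Exact integral of u(w) = 3/2 (w1^2 + w2^2) over the box [x0, x1] x [y0, y1]."""
    ix = (x1**3 - x0**3) / 3 * (y1 - y0)  # integral of w1^2 over the box
    iy = (y1**3 - y0**3) / 3 * (x1 - x0)  # integral of w2^2 over the box
    return F(3, 2) * (ix + iy)


zero, half, one = F(0), F(1, 2), F(1)
boxes = {
    "C1": (half, one, half, one),
    "C2": (zero, half, zero, half),
    "C3": (zero, half, half, one),
}

results = {}
for name, box in boxes.items():
    e = energy(*box)  # coefficient of the log(4 - 3 alpha) term
    # Each quadrant has measure 1/4 and its complement has measure 3/4.
    cross = 1 - (F(1, 4) * e + F(3, 4) * (1 - e))  # coefficient of alpha*log(alpha)
    results[name] = (e, cross)
    print(name, e, 1 - e, cross)
```

The complement coefficients follow from the normalization $\int_\Omega u\,dP = 1$, which the chosen factor 3/2 guarantees.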

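Let us turn to the relative depth of answers. Consider the above example again. The relative depth
$Y(\Omega, \mathbf{C}'', P, V_\alpha(\mathbf{C}'))$ of a quasi-perfect answer $V_\alpha(\mathbf{C}')$ with respect to question $\mathbf{C}''$ can be readily
computed using the expression (35). We obtain, for questions $\mathbf{C}_1$ and $\mathbf{C}_2$,

$$Y(\Omega, \mathbf{C}_2, P, V_\alpha(\mathbf{C}_1)) = \left(\frac{1}{2} - \frac{7}{32}\alpha\right)\log\left(\frac{4}{3}(1 - \alpha) + \alpha\right) + \left(\frac{1}{2} + \frac{13}{64}\alpha\right)\log\left(\frac{8}{9}(1 - \alpha) + \alpha\right) + \frac{1}{64}\,\alpha\log\alpha,$$

$$Y(\Omega, \mathbf{C}_1, P, V_\alpha(\mathbf{C}_2)) = \left(\frac{1}{2} - \frac{1}{32}\alpha\right)\log\left(\frac{4}{3}(1 - \alpha) + \alpha\right) + \left(\frac{1}{2} - \frac{5}{64}\alpha\right)\log\left(\frac{8}{9}(1 - \alpha) + \alpha\right) + \frac{7}{64}\,\alpha\log\alpha.$$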
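The asymmetry of the relative depth can be explored numerically. The sketch below is built on an assumed reading of expression (35), which is not reproduced in this section: the answer "$V$" to $\mathbf{C}'$ is taken to produce the posterior $(1 - \alpha)P(\cdot \mid V) + \alpha P$, and the relative depth is the $P(V)$-weighted sum, over subsets $C$ of $\mathbf{C}''$, of the posterior pseudoenergy of $C$ times the log of its posterior-to-prior mass ratio. The unit square is reduced to its four quadrants, which suffices for $\mathbf{C}_1$ and $\mathbf{C}_2$; all names are illustrative.

```python
import math

# Quadrants of the unit square as atoms: "l"/"h" = low/high half in (w1, w2).
MASS = {"ll": 0.25, "lh": 0.25, "hl": 0.25, "hh": 0.25}
ENERGY = {"ll": 1 / 16, "lh": 1 / 4, "hl": 1 / 4, "hh": 7 / 16}  # integrals of u
ATOMS = set(MASS)


def question(target):
    """Complete two-subset question {C, Omega minus C} for a single quadrant C."""
    return [{target}, ATOMS - {target}]


def relative_depth(asked, answered, alpha):
    """Relative depth of a quasi-perfect answer to `answered` w.r.t. `asked`."""
    total = 0.0
    for v in answered:  # the source replies "the truth lies in v"
        p_v = sum(MASS[a] for a in v)
        # Assumed posterior-to-prior density ratio on each atom.
        weight = {a: (1 - alpha) * (a in v) / p_v + alpha for a in ATOMS}
        for c in asked:
            prior = sum(MASS[a] for a in c)
            post = sum(MASS[a] * weight[a] for a in c)
            post_energy = sum(ENERGY[a] * weight[a] for a in c)
            if post > 0:  # the term 0 * log 0 is taken as 0
                total += p_v * post_energy * math.log(post / prior)
    return total


q1, q2 = question("hh"), question("ll")  # C_1 and C_2 of the text
for a in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(a, relative_depth(q2, q1, a), relative_depth(q1, q2, a))
```

With these assumptions the two directions coincide at $\alpha = 0$, both vanish at $\alpha = 1$, and in between the answer to the harder question $\mathbf{C}_1$ yields the larger relative depth.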

Similar expressions obtain for the other two question pairs. The results are shown in Fig. 5.
We can see, in particular, that the relative depth $Y(\Omega, \mathbf{C}'', P, V_\alpha(\mathbf{C}'))$ is not in general symmetric
in the two questions unless $\alpha = 0$ or $\alpha = 1$. In the former case the relative depth reduces to
the overlap $J(\Omega, (\mathbf{C}'; \mathbf{C}''), P)$, which is symmetric, and in the latter case the relative depth simply
vanishes. Further, it can be seen from Fig. 5 that the relative depth can in fact be negative,
meaning that knowledge of an (imperfect) answer to one question may make
another question more difficult. It would be interesting to establish general conditions under
which relative depth is nonnegative. Another useful observation is that if, for a pair of questions
$\mathbf{C}'$ and $\mathbf{C}''$, question $\mathbf{C}'$ is the more difficult of the two, then it appears that the inequality
$Y(\Omega, \mathbf{C}'', P, V_\alpha(\mathbf{C}')) > Y(\Omega, \mathbf{C}', P, V_\alpha(\mathbf{C}''))$ holds for $0 < \alpha < 1$, implying that a quasi-perfect
answer to a more difficult question results in a higher reduction of difficulty of the other question.
It would be of interest to see whether this property holds in the general case or whether exceptions
are possible.

FIG. 6. Sample questions.

To illustrate the process of estimating the pseudotemperature $u(\omega)$ and source model parameters,
consider an example in which $\Omega = [0, 1]^2 \subset \mathbb{R}^2$, and the measure $P$ is uniform continuous
on $\Omega$. Consider the set of sample (complete) questions illustrated in Fig. 6. Our goal is, given the
error parameters $\alpha_i$ for quasi-perfect answers $V_{\alpha_i}(\mathbf{C}_i)$ to questions $\mathbf{C}_i$, $i = 1, \ldots, 10$, to estimate the
function $u(\omega)$ and the parameter(s) of the chosen information source model.

We adopt the modified linear source model and use formulation (45) to estimate $u(\omega)$ and the
parameters $Y_s$ and $b$ of the model. We do this for different values of error probabilities.

First consider the data shown in Table I. In this and the following tables, the first column contains
the index $i$ of question $\mathbf{C}_i$ from Fig. 6, the second column shows the corresponding error probability
$\alpha_i$, and the last two columns contain the question difficulty $G(\Omega, \mathbf{C}_i, P)$ and answer depth
$Y(\Omega, \mathbf{C}_i, P, V_{\alpha_i}(\mathbf{C}_i))$, respectively, obtained from the estimated values of $u(\omega)$ and parameters of
the source model. In the lower part of Table I the resulting value of the objective of problem (45),
along with the estimated values of parameters $Y_s$ and $b$, is shown.
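As a rough consistency check on the fitted quantities, the sketch below compares fitted difficulty and depth values with a saturating-linear relation $Y = \min(bG, Y_s)$. This functional form is an assumption inferred here from the reported numbers, not a quotation of the model definition; the $(G, Y)$ pairs and the rounded parameters $Y_s = 0.52$, $b = 0.76$ are those reported for Table I.

```python
def modified_linear_depth(difficulty, slope, capacity):
    """Assumed saturating-linear source model: depth grows as slope * difficulty
    until it reaches the capacity, after which it stays flat."""
    return min(slope * difficulty, capacity)


# Distinct (G, Y) pairs among the fitted values reported for Table I.
fitted = [(0.803, 0.516), (0.533, 0.404), (1.000, 0.516),
          (1.102, 0.516), (0.761, 0.516)]
Y_S, B = 0.52, 0.76  # rounded parameter values reported with Table I

residuals = [abs(modified_linear_depth(g, B, Y_S) - y) for g, y in fitted]
print(max(residuals))
```

All residuals stay within the rounding of the reported parameters, and only the easiest question falls on the linear branch; the others sit at the capacity.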

TABLE I. Sample question error probabilities, fitted values of the difficulty and depth functions, and
estimated model parameter values for the modified linear model when perfect fit is possible.
$\sum_i z_i = 0$; $U = 0.13$; $Y_s = 0.52$; $b = 0.76$.

  $i$    $\alpha_i$    $G(\Omega, \mathbf{C}_i, P)$    $Y(\Omega, \mathbf{C}_i, P, V_{\alpha_i}(\mathbf{C}_i))$
         0.143    0.803    0.516
         0.077    0.533    0.404
         0.210    1.000    0.516
         0.210    1.000    0.516
         0.253    1.102    0.516
         0.116    0.761    0.516
         0.253    1.102    0.516
  10     0.116    0.761    0.516

FIG. 7. The estimated pseudotemperature (left) and the fitted values of difficulty and depth (right) for the
data of Table I.

The error probability values shown in Table I result in a perfect fit ($z = 0$) with the estimated
pseudotemperature function $u(\omega)$ (shown in Fig. 7). We can see that the resulting pseudotemperature
function increases for the larger values of coordinates $\omega_1$ and $\omega_2$ on $\Omega$, reflecting the fact that,
for instance, $\alpha_1 > \alpha_4$, implying that question $\mathbf{C}_1$ has higher difficulty (larger value of
pseudoenergy) than $\mathbf{C}_4$ in spite of these two questions having the same value of entropy. This means that the
smaller measure subset in $\mathbf{C}_1$ has to have higher pseudotemperature, which we indeed see. It is also
worth noting that questions $\mathbf{C}_5$ and $\mathbf{C}_6$ were answered with equal accuracy, suggesting that these
questions are of equal difficulty. This in fact is a necessary condition for a perfect fit within the
ideal gas question difficulty model, since in this model any complete question with all subsets of
equal measure would have the same difficulty (pseudoenergy) regardless of the form of the
pseudotemperature function.
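The last claim can be illustrated directly. The sketch below assumes the ideal gas difficulty takes the form $G = \sum_i E_i \log(1/P(C_i))$ with $E_i = \int_{C_i} u\,dP$ and $\sum_i E_i = 1$, which is the $\alpha = 0$ limit of the depth expressions in this section; the random pseudoenergy splits stand in for arbitrary pseudotemperature functions and are purely illustrative.

```python
import math
import random


def difficulty(masses, energies):
    """Assumed ideal gas difficulty: sum_i E_i * log(1 / P(C_i)),
    where E_i is the integral of u over C_i and the E_i sum to 1."""
    return sum(e * math.log(1 / p) for p, e in zip(masses, energies))


def random_energies(n, rng):
    """A random split of the unit total pseudoenergy among n subsets."""
    raw = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]


rng = random.Random(0)
n = 4
equal_masses = [1 / n] * n
equal_measure_values = [difficulty(equal_masses, random_energies(n, rng))
                        for _ in range(5)]
print(equal_measure_values, math.log(n))

# With unequal measures the pseudoenergy split does matter
# (quadrant questions C_1 and C_2 of the previous example):
g_c1 = difficulty([1 / 4, 3 / 4], [7 / 16, 9 / 16])
g_c2 = difficulty([1 / 4, 3 / 4], [1 / 16, 15 / 16])
print(g_c1, g_c2)
```

For equal measures the common factor $\log n$ can be pulled out of the sum, leaving the total pseudoenergy, which is fixed by normalization.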
TABLE II. Sample question error probabilities, fitted values of the difficulty and depth functions, and
estimated model parameter values for the modified linear model when perfect fit is not possible, with small
misfit. $\sum_i z_i = 0.07$; $U = 0.43$; $Y_s = 0.53$; $b = 0.74$.

FIG. 8. The estimated pseudotemperature (left) and the fitted values of difficulty and depth (right) for the
data of Table II.

Consider now the data shown in Table II. The resulting pseudotemperature $u(\omega)$ is shown in Fig. 8.
We see that in this case a perfect fit could not be achieved by any pseudotemperature function,
in particular because questions $\mathbf{C}_5$ and $\mathbf{C}_6$ were answered with slightly different accuracy whereas
these two questions necessarily have equal pseudoenergy content (equal difficulty) within the ideal
gas question difficulty model.

TABLE III. Sample question error probabilities, fitted values of the difficulty and depth functions, and
estimated model parameter values for the modified linear model when perfect fit is not possible, with larger
misfit. $\sum_i z_i = 1.51$; $U = 0.56$; $Y_s = 1.28$; $b = 0.73$.

FIG. 9. The estimated pseudotemperature (left) and the fitted values of difficulty and depth (right) for the
data of Table III.

Now, consider the data shown in Table III. As can be seen from Fig. 9, the fit that could be
achieved to the ideal gas question difficulty model (with the modified linear information source
model) is relatively poor (at least compared to the previous example), possibly indicating that
the ideal gas model may not be adequate in this case and that a different model (for example,
an anisotropic one, able to model different pseudoenergy content of questions $\mathbf{C}_5$ and $\mathbf{C}_6$) may be
needed.

Let us now turn to comparing different sources. Suppose $\Omega = [0, 1]$ with $P$ being a uniform
continuous measure on $\Omega$. Let the sample questions be as follows: $\mathbf{C}_1 = \{[0, 1/2], (1/2, 1]\}$,
$\mathbf{C}_2 = \{[0, 1/3], (1/3, 1]\}$, $\mathbf{C}_3 = \{[0, 2/3], (2/3, 1]\}$, $\mathbf{C}_4 = \{[0, 1/4], (1/4, 1]\}$,
$\mathbf{C}_5 = \{[0, 3/4], (3/4, 1]\}$. Let the accuracy of source 1 be described by the error probabilities
(assuming quasi-perfect answers as before) shown in Table IV. Then, using the modified capacity model and
formulation (45), we can estimate the pseudotemperature function $u(\cdot)$ and the model parameters $Y_s$
and $b$. The results, as well as fitted values of the question difficulty and answer depth, are shown
in Table IV.

TABLE IV. Sample question error probabilities, fitted values of the difficulty and depth functions, estimated
model parameter values for the modified linear model, for information source 1.
$Y_s = 0.74$; $b = 0.77$.
Columns: $i$, $\alpha_i$, $G(\Omega, \mathbf{C}_i, P)$, $Y(\Omega, \mathbf{C}_i, P, V_{\alpha_i}(\mathbf{C}_i))$.

TABLE V. Sample question error probabilities, fitted values of the difficulty and depth functions, estimated
model parameter values for the modified linear model, for information source 2. $\sum_i z_i = 0.18$; $U = 0.56$;
$Y_s = 0.39$; $b = 0.74$.

Table V shows the error probabilities achieved on the same set of sample questions by a different
source 2, along with the resulting fitted values of the difficulty and depth functions and the estimated
model parameter values. Looking at Tables IV and V we can see, for example, that source 1 shows
better overall performance on the sample questions, but there exist questions (question 5, for instance)
that appear to be easier for source 2. Indeed, the estimated pseudotemperature functions shown
in Fig. 10 (in the unit source capacity convention) clearly demonstrate that the overall
pseudotemperature is significantly higher for source 2, thus making the majority of sample questions more
difficult for it (which is reflected in higher error probabilities). On the other hand, while the
pseudotemperature function for source 1 is (mostly) increasing on the interval [0, 1], it is a decreasing
function on the same interval for source 2. In particular, there exist regions of $\Omega = [0, 1]$ where the
pseudotemperature for source 2 is lower than that for source 1. This means that some questions
can be easier for source 2, question 5 from the sample set being an example.
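The effect of the pseudotemperature profile on individual questions can be illustrated with two hypothetical profiles on $[0, 1]$: an increasing $u(\omega) = 2\omega$ and a decreasing $u(\omega) = 2(1 - \omega)$. These are not the fitted functions of Fig. 10; both are normalized to unit average, so the sketch ignores the difference in overall level between the two sources and isolates the effect of shape. The difficulty is assumed to take the ideal gas form $\sum_i E_i \log(1/P(C_i))$ with $E_i = \int_{C_i} u\,dP$.

```python
import math


def interval_question_difficulty(cut, energy_below):
    """Difficulty of the question {[0, cut], (cut, 1]} under uniform P on [0, 1].

    energy_below(cut) returns the integral of u over [0, cut]; u is assumed
    normalized so that its integral over [0, 1] equals 1.
    """
    e_low = energy_below(cut)
    return e_low * math.log(1 / cut) + (1 - e_low) * math.log(1 / (1 - cut))


def increasing_energy(t):
    """Integral of the hypothetical profile u(w) = 2w over [0, t]."""
    return t * t


def decreasing_energy(t):
    """Integral of the hypothetical profile u(w) = 2(1 - w) over [0, t]."""
    return 2 * t - t * t


# Question 5 cuts at 3/4, question 4 cuts at 1/4.
g5_inc = interval_question_difficulty(0.75, increasing_energy)
g5_dec = interval_question_difficulty(0.75, decreasing_energy)
g4_inc = interval_question_difficulty(0.25, increasing_energy)
g4_dec = interval_question_difficulty(0.25, decreasing_energy)
print(g5_inc, g5_dec, g4_inc, g4_dec)
```

Question 5 isolates the small interval $(3/4, 1]$, where the decreasing profile is low, so it comes out easier under that profile, while question 4 shows the opposite ordering.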
FIG. 10. Estimated pseudotemperature functions for information sources 1 and 2.



XI. CONCLUSION 



This article is devoted to developing a quantitative framework for information exchange between a
problem solving agent and an information source. The latter is assumed to be capable of providing
answers to the agent's questions. While the companion article [2] is mostly concerned with questions,
the main subject of the present article is answers and information source models. Questions
can be characterized by the question difficulty functional, which can be thought of as the amount
of "work" the source would have to do in order to answer the particular question perfectly. The
question difficulty functional is source-specific and reflects the knowledge structure of the source.
The nature of the geometric objects describing the knowledge structure is dictated by the symmetry
exhibited by the latter. In particular, in the isotropic (linear) case, it is described by a scalar function,
while an anisotropic knowledge structure would likely be described, as indicated by a preliminary
investigation, by a symmetric rank 2 tensor. The corresponding characterization of answers, the
answer depth functional, is developed in the present article. It can be informally thought of as a
measure of the amount of "work" the source is actually capable of doing in response to a particular
question. The overall form of the answer depth functional reflects the source's knowledge structure
and therefore largely parallels that of question difficulty. The value of answer depth is equal to
that of question difficulty in the case of a perfectly accurate answer and is less than question difficulty
if the answer allows for errors.

Information source models describe the relationship between answer depth and question difficulty.
It can be said that, while question difficulty reflects the source's knowledge structure, the
information source model specifies the source's knowledge strength by relating the answer depth the
source is capable of producing to the corresponding question difficulty. It is reasonable to expect
that most sources would exhibit some upper bound on the achievable answer depth, which
could be identified with the source capacity. With regard to the latter, it is worth pointing out
that this is not information capacity but rather pseudoenergy capacity, which is the measure of
the maximum "work" the particular source is capable of.

The framework for describing information exchange between the agent and information sources
is developed as part of a theory of the Full Information Chain, which is anticipated to be an
extension of the classical Information Theory in the direction of quantitative study of the
accuracy and relevance attributes of information, in addition to information quantity, which is the main
concern of Information Theory. The framework developed in [2] and the present article contains the basics of
a theory of information acquisition, with information accuracy being the main attribute involved.
The information usage link, which deals with the information relevance attribute, is the subject of
future publications. It should be noted that information acquisition and usage appear to be
fundamentally interconnected and have to be treated by a joint theory. Therefore any practical
algorithms for optimal information acquisition will also be presented in future publications, following
a description of the information usage link.